Responses to open-ended questions

The coding of responses to open-ended questions is very time-consuming and therefore expensive. This task can be automated if the answers are homogeneous, although the effort required may seem high at first.

If the text of the responses is available, it must be prepared for processing with TextQuest. Using statistical software (e.g. SPSS), the text is written to a plain text file that TextQuest can read in its fixed format. In addition to the text, external variables must be included; these are necessary for merging the numerical results of the coding with the rest of the data set.

In fixed format, the values of the external variables and the text always occupy the same columns on each line. Each line contains one answer to one question. The external variables must come at the beginning of the line, and the text must be the last item.
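The fixed-column layout described above can be sketched in Python as follows. This is only an illustration: the column widths (6 columns for a case ID, 2 for a question number) and the function names are assumptions made for the example, not values prescribed by TextQuest.

```python
# Hypothetical column widths, chosen for this example only.
ID_WIDTH = 6
Q_WIDTH = 2

def to_fixed_line(case_id, question_no, text):
    """Format one answer as one line: external variables first, text last."""
    # Collapse line breaks and repeated blanks so the answer stays on one line.
    clean = " ".join(text.split())
    return f"{case_id:>{ID_WIDTH}}{question_no:>{Q_WIDTH}}{clean}"

def parse_fixed_line(line):
    """Read the external variables and the text back from their columns."""
    case_id = int(line[:ID_WIDTH])
    question_no = int(line[ID_WIDTH:ID_WIDTH + Q_WIDTH])
    text = line[ID_WIDTH + Q_WIDTH:]
    return case_id, question_no, text

answers = [
    (17, 1, "Too expensive,\nand slow service"),
    (204, 1, "Friendly staff"),
]
lines = [to_fixed_line(cid, q, txt) for cid, q, txt in answers]
```

Because every external variable is right-aligned in its own columns, the case ID can later be used as the key for merging the coding results with the rest of the data set.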

After the system file has been generated, a category system must be developed for each question that is to be analysed. If there are, for example, 5 open questions, 5 category systems are required. TextQuest does not develop the categories itself, but it supports this task with several modules:

  • a word list
  • a word sequence list
  • a word permutation list
  • KWIC (key-word-in-context) list
  • SIT (search pattern in text unit) list
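
To give an idea of what two of these lists contain, the following Python sketch builds a simple word list (word frequencies) and a KWIC list from a few made-up answers. It is not the TextQuest implementation; the tokenisation rule, the function names, and the context window of two words are assumptions for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Split an answer into lower-case word tokens (a simplifying assumption)."""
    return re.findall(r"\w+", text.lower())

def word_list(answers):
    """Return (word, frequency) pairs, most frequent first."""
    counts = Counter()
    for answer in answers:
        counts.update(tokenize(answer))
    return counts.most_common()

def kwic(answers, keyword, window=2):
    """Return (left context, keyword, right context) for every occurrence."""
    hits = []
    for answer in answers:
        tokens = tokenize(answer)
        for i, token in enumerate(tokens):
            if token == keyword:
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                hits.append((left, token, right))
    return hits

answers = [
    "The price is too high",
    "High price but good service",
    "Service was slow",
]
frequencies = dict(word_list(answers))
price_contexts = kwic(answers, "price")
```

Frequent words are candidates for search patterns, and the contexts show whether a word is used in the sense the category is meant to capture.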

Coding can be performed once the category system is complete. The file that contains the counters for each category per text unit is suitable for further statistical analyses; TextQuest generates a setup (syntax file) for SAS, SimStat, ConClus, or SPSS. You can monitor the coding process by checking the file of uncoded text units, which should be empty or as small as possible (< 5%). An iterative technique is to start with a few search patterns and use the file of uncoded text units to refine and extend the category system, with the goal of minimising that file. Because a coding run takes only a few seconds or minutes, you can repeat it as often as you like.
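
The iterative technique can be illustrated with a small Python sketch. It is a simplified stand-in for the coding step, not how TextQuest works internally: search patterns are treated as plain substrings, the category names and patterns are invented, and the 5% threshold is assumed to refer to the share of text units that remain uncoded.

```python
def code_units(units, categories):
    """Count pattern matches per category for each text unit.

    Returns the counters per unit and the list of units with no match at all.
    """
    coded, uncoded = [], []
    for unit_id, text in units:
        lowered = text.lower()
        counts = {
            name: sum(lowered.count(pattern) for pattern in patterns)
            for name, patterns in categories.items()
        }
        coded.append((unit_id, counts))
        if not any(counts.values()):
            uncoded.append((unit_id, text))
    return coded, uncoded

units = [
    (1, "Too expensive and slow"),
    (2, "Friendly staff"),
    (3, "The price is high"),
    (4, "No comment"),
]

# First pass: a small, hypothetical category system.
categories = {"price": ["expensive", "price"], "service": ["slow", "staff"]}
coded, uncoded = code_units(units, categories)
uncoded_share = len(uncoded) / len(units)

# Second pass: extend the category system using the uncoded units.
categories["no_answer"] = ["no comment"]
coded2, uncoded2 = code_units(units, categories)
```

After the first pass one of the four units is still uncoded; adding a category for it in the second pass empties the list of uncoded units.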

The last step is to merge the data produced by TextQuest with the original data by adding the new variables (join files in SPSS).
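
For readers who do not use SPSS, the same join can be sketched in plain Python. The variable names and values are invented for the example, and the case ID is assumed to be the external variable that both files share.

```python
def merge_by_id(original, coded, new_vars):
    """Add the coding variables to each case of the original data set.

    Cases without a coded answer receive None (missing) for the new variables.
    """
    merged = {}
    for case_id, row in original.items():
        extra = coded.get(case_id, {})
        merged[case_id] = {**row, **{var: extra.get(var) for var in new_vars}}
    return merged

# Original survey data, keyed by case ID (hypothetical values).
original = {1: {"age": 34}, 2: {"age": 51}, 3: {"age": 27}}

# Category counters from the coding step, keyed by the same case ID.
coded = {
    1: {"cat_price": 1, "cat_service": 1},
    3: {"cat_price": 1, "cat_service": 0},
}

merged = merge_by_id(original, coded, ["cat_price", "cat_service"])
```

Keeping every original case and filling absent codings with a missing value means that respondents who gave no answer remain in the data set.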