After running some experiments with Stack Overflow data, I wondered how these models behave on a different corpus. Is it a good idea to predict tags from the body with a model that was trained on titles? I used the models from this post and followed a simple methodology for this experiment:
1- Take an already trained model.
2- Apply the model to every test set other than its own. For example, take the title model file and apply it to body, title + body, and title + body + comments.
The models and test sets are numbered by the content they contain:
- I: Title only
- II: Body only
- III: Title + Body
- IV: Title + Body + Comments
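The methodology above amounts to scoring every model against every test set. A minimal sketch of that loop follows. It assumes the models are fastText supervised classifiers (the P@1/R@1 metrics and the ws parameter suggest this, but it is not stated here) and that the official Python bindings are used, where `fasttext.load_model` loads a saved model and `model.test` returns the sample count, precision and recall at `k`. The function names and the file names in the usage comment are hypothetical placeholders, not the files behind the results in this post.

```python
from typing import Callable, Dict, Tuple

Score = Tuple[float, float]  # (precision@k, recall@k)


def fasttext_score(model_path: str, test_path: str, k: int = 1) -> Score:
    """Score one saved model on one labelled test file (assumes fastText)."""
    import fasttext  # imported lazily so the grid logic has no hard dependency

    model = fasttext.load_model(model_path)
    _, precision, recall = model.test(test_path, k=k)
    return precision, recall


def cross_evaluate(
    model_paths: Dict[str, str],
    test_paths: Dict[str, str],
    score: Callable[[str, str], Score],
) -> Dict[Tuple[str, str], Score]:
    """Return {(test_name, model_name): (precision, recall)} for every pair."""
    results = {}
    for test_name, test_path in test_paths.items():
        for model_name, model_path in model_paths.items():
            results[(test_name, model_name)] = score(model_path, test_path)
    return results


# Hypothetical usage; the paths are placeholders:
# models = {"I": "title.bin", "II": "body.bin"}
# tests = {"I": "title.test.txt", "II": "body.test.txt"}
# grid = cross_evaluate(models, tests, fasttext_score)
```

Passing the scoring function as an argument keeps the grid logic independent of the classifier library, so the same loop works with any tool that reports precision and recall.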
| P@1 | Model I | Model II | Model III | Model IV |
|---|---|---|---|---|
| Test Data I | 0.491 | 0.405 | 0.425 | 0.415 |
| Test Data II | 0.478 | 0.399 | 0.484 | 0.487 |
| Test Data III | 0.507 | 0.436 | 0.518 | 0.522 |
| Test Data IV | 0.506 | 0.421 | 0.512 | 0.515 |

| R@1 | Model I | Model II | Model III | Model IV |
|---|---|---|---|---|
| Test Data I | 0.212 | 0.175 | 0.184 | 0.179 |
| Test Data II | 0.207 | 0.173 | 0.209 | 0.21 |
| Test Data III | 0.219 | 0.189 | 0.224 | 0.226 |
| Test Data IV | 0.218 | 0.182 | 0.222 | 0.222 |
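For reading the tables: P@1 is the fraction of top-1 predictions that are correct, and R@1 is the fraction of all true tags recovered by those top-1 predictions. Since a question usually carries several tags, R@1 cannot exceed one over the average number of tags per question, which is why it sits well below P@1. The sketch below shows the computation on made-up toy data (the questions and tags are invented for illustration, not taken from the experiment), and it assumes the numbers above follow this usual definition.

```python
from typing import List, Set, Tuple


def precision_recall_at_1(
    top1: List[str], gold: List[Set[str]]
) -> Tuple[float, float]:
    """Compute P@1 and R@1 from one predicted tag per question."""
    if len(top1) != len(gold):
        raise ValueError("need exactly one prediction per question")
    # A prediction counts as correct if it is among the question's true tags.
    correct = sum(1 for pred, tags in zip(top1, gold) if pred in tags)
    n_predictions = len(top1)
    n_gold = sum(len(tags) for tags in gold)
    return correct / n_predictions, correct / n_gold


# Toy data, invented for illustration only.
gold = [{"python", "pandas"}, {"java"}, {"c++", "templates", "stl"}, {"git"}]
top1 = ["python", "java", "boost", "svn"]
p_at_1, r_at_1 = precision_recall_at_1(top1, gold)
print(f"P@1={p_at_1:.3f} R@1={r_at_1:.3f}")  # P@1=0.500 R@1=0.286
```

Under this definition the ratio P@1 / R@1 equals the average number of tags per question. In the tables that ratio is about 2.3 throughout, which would be consistent with roughly 2.3 tags per question.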
1- The title-only model (I) is trained on less content than the others, so it does not work well with test data that has more content: it gives 7% lower precision and 3% lower recall. Model I is trained on 407,596 words, which is almost 6 times fewer than the others in the best case.
2- Models II, III and IV also give worse precision and recall on Test Data I, for the same reason.
3- Points 1 and 2 show that the content size of the training data is very important for classifier accuracy. It is not a good idea to train a model on larger content and predict on data with less content, or vice versa.
4- Model II is trained on 3,963,237 words, Model III on 4,105,156 words and Model IV on 5,431,988 words. Model II gives better results for Test Data III and IV, which is kind of surprising to me.
5- Model III gives better results for Model II and IV.
6- Model IV gives better results for Model II and III.
7- Models III and IV give better results for Model II. I think this is because the body dominates the data, and the comments also give good information about the tags.
8- I wonder about the effect of ws (the context window size) on this data, for example by building Model IV again with ws = 1 and testing it on Test Data I.
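The comparisons in the list above can be checked directly against the P@1 table. The sketch below copies the P@1 values from the table and reports, for each test set, which model scores highest and how much it gains over the model trained on the same kind of content. The dictionary layout and the function name are only for illustration.

```python
MODELS = ["I", "II", "III", "IV"]

# P@1 values copied from the table: one row per test set, one column per model.
P_AT_1 = {
    "I": [0.491, 0.405, 0.425, 0.415],
    "II": [0.478, 0.399, 0.484, 0.487],
    "III": [0.507, 0.436, 0.518, 0.522],
    "IV": [0.506, 0.421, 0.512, 0.515],
}


def best_model_per_test(table):
    """Map each test set to (best model, its score, gain over the matching model)."""
    summary = {}
    for test_name, scores in table.items():
        by_model = dict(zip(MODELS, scores))
        best = max(by_model, key=by_model.get)
        gain = round(by_model[best] - by_model[test_name], 3)
        summary[test_name] = (best, by_model[best], gain)
    return summary


for test_name, (best, score, gain) in best_model_per_test(P_AT_1).items():
    print(
        f"Test Data {test_name}: best is Model {best} "
        f"({score:.3f}, {gain:+.3f} vs Model {test_name})"
    )
```

On these numbers Model IV is the top scorer on Test Data II, III and IV, while Model I is the best only on its own title-only data.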