What is a Training Set?
A Cipher Classifier is a machine-learning algorithm built from a training set: a dataset you use to 'train' the classifier so that it learns what your data query should and should not return.
This is done by identifying both 'Positive' and 'Negative' patent examples.
Positive = examples of what you ARE interested in finding
Negative = examples of what you are NOT interested in finding
The more examples you have of each, the better the machine can 'learn' what you're interested in finding.
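To make the idea of Positive and Negative examples concrete, here is a minimal sketch of a toy word-count classifier in Python. It is not how the Cipher Classifier works internally; the function names (`train`, `score`), the sample texts, and the scoring rule are all assumptions chosen for illustration only.

```python
from collections import Counter

def tokenize(text):
    # Split a text into lowercase words.
    return text.lower().split()

def train(positives, negatives):
    # Count how often each word appears in Positive and in Negative examples.
    pos_counts = Counter(w for t in positives for w in tokenize(t))
    neg_counts = Counter(w for t in negatives for w in tokenize(t))
    return pos_counts, neg_counts

def score(model, text):
    # Return a value between 0 and 1; higher means closer to the Positive examples.
    pos_counts, neg_counts = model
    words = tokenize(text)
    pos_hits = sum(pos_counts[w] for w in words)
    neg_hits = sum(neg_counts[w] for w in words)
    total = pos_hits + neg_hits
    # With no known words there is no evidence either way.
    return 0.5 if total == 0 else pos_hits / total

# Hypothetical training set: short texts standing in for patents.
positives = ["wireless battery charging coil", "inductive charging pad for phones"]
negatives = ["engine fuel injection valve", "fuel pump for combustion engine"]

model = train(positives, negatives)
print(score(model, "wireless charging coil"))
print(score(model, "fuel valve"))
```

Adding more examples of each kind gives the counts more words to draw on, which is the toy equivalent of the machine 'learning' more about what you want.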
How do I populate the Training Set?
You can use any of the following features in the Cipher Classifier builder:
- Upload patent numbers: this could be a list from previous searches, or examples from different portfolios.
- Global Search: type a simple search string specifying, for example, what you are NOT interested in, and mark the results as 'Negative'. Alternatively, search for a company that you know owns patents in the technology of interest and mark relevant results as 'Positive'.
- Suggestions tab: this list will start to populate once you have started adding 'Positive' and 'Negative' examples. Use this list in conjunction with the other two options above to populate your training set.
When do I click 'Build Classifier'?
You can start building the classifier as soon as you have at least 5 examples each of Positive and Negative patents. Remember to click 'Build Classifier' again as you continue to add examples to your training set. Each rebuild recalibrates the 'Score' you see for every patent and gives an indication of how much more training is required.
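The minimum-example rule can be sketched as a simple check. This is an illustration only: the function name `ready_to_build` and the label strings are hypothetical, and the threshold assumes the minimum means five examples of each label.

```python
# Assumption: the minimum is five examples of each label.
MIN_EXAMPLES_PER_LABEL = 5

def ready_to_build(labels, minimum=MIN_EXAMPLES_PER_LABEL):
    # Count the examples marked 'Positive' and 'Negative' in the training set.
    positives = sum(1 for label in labels if label == "Positive")
    negatives = sum(1 for label in labels if label == "Negative")
    # Both labels must reach the minimum before a first build makes sense.
    return positives >= minimum and negatives >= minimum

# Hypothetical training set: five Positive examples but only four Negative ones.
labels = ["Positive"] * 5 + ["Negative"] * 4
print(ready_to_build(labels))

# Adding one more Negative example meets the minimum for both labels.
labels.append("Negative")
print(ready_to_build(labels))
```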