A Cipher technology cluster is a grouping of patent families relating to the same technical area.
Technology clusters are based on the ‘features’, or metadata, of patents.
Cipher uses all of the available metadata rather than just one or two attributes, so that the technology clusters are as accurate as possible.
Metadata used to create clusters includes:
- CPC codes
Note: As a rule, a portfolio shows no more than about 10 clusters. A cluster named ‘Miscellaneous’ indicates that many much smaller clusters have been rolled up into a single larger cluster, which Cipher names ‘Miscellaneous’.
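The roll-up described in the note can be illustrated with a minimal sketch. This is not Cipher's implementation: the function name, the limit of 10, and the choice to rank clusters by the number of patent families are all assumptions made for illustration.

```python
from collections import Counter

def roll_up_small_clusters(assignments, max_clusters=10, misc_label="Miscellaneous"):
    """Merge the smallest clusters into one catch-all cluster.

    assignments: dict mapping a patent family id to its cluster label.
    max_clusters and misc_label are illustrative assumptions.
    """
    sizes = Counter(assignments.values())
    # Nothing to do if the portfolio is already within the limit.
    if len(sizes) <= max_clusters:
        return dict(assignments)
    # Keep the largest clusters, leaving one slot for the catch-all.
    keep = {label for label, _ in sizes.most_common(max_clusters - 1)}
    return {
        family: (label if label in keep else misc_label)
        for family, label in assignments.items()
    }
```

With 12 input clusters, for example, the nine largest would keep their labels and the remaining three would be shown together as ‘Miscellaneous’.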
How are cluster names determined?
Cipher cluster names are machine-generated from the titles and abstracts of the patent families in the cluster. Each cluster is given the name that most closely describes all of its patent families, using text summarisation and natural language processing (NLP) techniques.
The clustering and naming algorithms are completely separate, which rules out a self-fulfilling prophecy in the clustering results. If clustering were based on the occurrence of a certain phrase, a cluster would be biased towards patents that use that particular phrase and would leave out closely related patents that do not, skewing the clustering results.
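To make the separation concrete, here is a small sketch in which names are derived only after cluster membership has been fixed. It is a simple term-frequency stand-in, not Cipher's summarisation or NLP method. The function names, the stopword list, and the sample titles are hypothetical, and the cluster assignments are assumed to come from an earlier, metadata-based step.

```python
import re
from collections import Counter

# Illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "of", "the", "to", "with", "in", "on"}

def name_cluster(titles, n_terms=2):
    """Name a cluster from the terms shared by most of its titles."""
    doc_freq = Counter()
    for title in titles:
        # Count each term once per title (document frequency).
        terms = set(re.findall(r"[a-z]+", title.lower())) - STOPWORDS
        doc_freq.update(terms)
    # Most widely shared terms first; ties broken alphabetically.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(term for term, _ in ranked[:n_terms]).title()

def name_clusters(assignments, titles):
    """assignments: family id -> cluster id (produced independently of the text).
    titles: family id -> title text."""
    grouped = {}
    for family, cluster_id in assignments.items():
        grouped.setdefault(cluster_id, []).append(titles[family])
    return {cid: name_cluster(ts) for cid, ts in grouped.items()}

# Hypothetical data: cluster ids were assigned before any text was read.
assignments = {"f1": 0, "f2": 0, "f3": 0, "f4": 1, "f5": 1}
titles = {
    "f1": "Battery charging circuit",
    "f2": "Wireless battery charging",
    "f3": "Charging a battery pack",
    "f4": "Antenna array for radio",
    "f5": "Radio antenna tuning",
}
names = name_clusters(assignments, titles)
```

Because the text is only read once the clusters exist, the name describes the cluster without influencing which patent families belong to it.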