Designing Your Catalog Matching Program
To design a program that adds Rovi IDs to your catalog of music, movies, or television shows, you'll want to take advantage of two major features of Catalog Matching API responses:
- A relevance score between 0 and 1 for each match returned.
- The variety of data that can be returned with each match so you can, when necessary, manually confirm any match.
The API returns the most likely match first: the match with the highest relevance score. You can generally increase relevance scores by following these guidelines when you construct requests:
- In the name parameter, include the full contents of the field that contains the name of the file, song, show, album, movie, person, or group. Don't use just keywords. Even punctuation, such as a comma, may increase the relevance score when similar keywords exist in multiple records.
- Supply every parameter for which you have data. Each parameter you supply helps increase the relevance score of the match you want when there are other close matches.
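The two guidelines above can be sketched as a small helper that builds request parameters from a catalog record. This is a minimal illustration only: apart from the name parameter, the field and parameter names (title, performer, releaseyear, label) are hypothetical, and no endpoint is shown because it depends on the match request you call.

```python
from urllib.parse import urlencode


def build_match_params(record, name_field="title"):
    """Build query parameters for one match request from a catalog record.

    Assumption: `record` is a dict of your own catalog fields, and fields
    other than the name field map one-to-one onto request parameters.
    """
    params = {}
    for key, value in record.items():
        # Skip fields that have no data; only populated fields are sent.
        if value is None or str(value).strip() == "":
            continue
        if key == name_field:
            # Pass the full field contents, punctuation included, not keywords.
            params["name"] = str(value)
        else:
            params[key] = str(value)
    return params


# Hypothetical record: two populated fields, two empty ones.
record = {
    "title": "Hello, Dolly!",
    "performer": "Louis Armstrong",
    "releaseyear": "",
    "label": None,
}
params = build_match_params(record)
query_string = urlencode(params)  # URL-encodes the comma and other punctuation
```

The empty fields are dropped, and the comma and exclamation point in the name survive as encoded characters in the query string.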
Even so, the relevance score of the top match is not the most important indicator that the top match is the correct match.
The most important measure of confidence in the top match is the difference between the relevance scores of the returned matches. The greater the difference between the top match and the second match, the more certain you can be that the top match is the correct match.
So at what difference can you have confidence that the top match is the correct match? That can vary with catalog and data, but we suggest starting with a value of 0.1. Flag a result for manual verification when the difference between the top two scores is less than 0.1. With experience, you may decide to increase or decrease that threshold.
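As a rough sketch of this rule, the following functions compute the spread between the top two relevance scores and flag results that fall below the suggested 0.1 threshold. The function names are illustrative, and the handling of a single returned match (treating the missing second score as 0) is an assumption, not documented API behavior.

```python
REVIEW_THRESHOLD = 0.1  # suggested starting value; tune with experience


def confidence_spread(scores):
    """Return the top score minus the second score.

    `scores` is a list of relevance scores ordered best match first.
    Returns None when there are no matches at all.
    """
    if not scores:
        return None
    # Assumption: with only one match, treat the missing second score as 0.
    second = scores[1] if len(scores) > 1 else 0.0
    return scores[0] - second


def needs_review(scores, threshold=REVIEW_THRESHOLD):
    """Flag a result for manual verification when the spread is too small."""
    spread = confidence_spread(scores)
    return spread is None or spread < threshold


# A clear winner (spread about 0.35) versus two close candidates (about 0.07).
clear = needs_review([0.95, 0.60, 0.20])
close = needs_review([0.92, 0.85])
```

Note that a high top score alone does not clear the second case: 0.92 looks strong, but the runner-up is too close to trust the match without review.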
Prepare Your Database
Decide which fields you want to use and, if necessary, add them to records in your database.
- Add one field to each record for the Rovi ID. Rovi IDs are alphanumeric strings.
- Add a field for a confidence level value between 0 and 1. The confidence level is the spread between the relevance scores of the first and second results.
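For a SQL-backed catalog, adding the two fields might look like the sketch below. It uses an in-memory SQLite database purely for illustration; the table name albums and the column names rovi_id and confidence are hypothetical, so substitute your own schema.

```python
import sqlite3

# In-memory database standing in for an existing catalog (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE albums (id INTEGER PRIMARY KEY, title TEXT, performer TEXT)"
)
conn.execute("INSERT INTO albums (title, performer) VALUES (?, ?)",
             ("Hello, Dolly!", "Louis Armstrong"))

# Rovi IDs are alphanumeric, so store them as text rather than as a number.
conn.execute("ALTER TABLE albums ADD COLUMN rovi_id TEXT")
# The confidence level is a value between 0 and 1; NULL means not yet matched.
conn.execute("ALTER TABLE albums ADD COLUMN confidence REAL")

# Confirm the new columns exist and that existing rows start out empty.
columns = [row[1] for row in conn.execute("PRAGMA table_info(albums)")]
unmatched = conn.execute(
    "SELECT COUNT(*) FROM albums WHERE rovi_id IS NULL"
).fetchone()[0]
```

Leaving both new columns empty for existing rows makes it easy to find records that still need a first pass.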
Design Your Program
Here are design guidelines for a program that uses the Catalog Matching API to capture Rovi IDs.
- Offer a user menu. Consider the following options:
  - Run with no verification (first pass)
  - Run with manual verification (first pass)
  - Manual verification of low-confidence results (second pass)
  - Manual verification of a particular Rovi ID (second pass)
- On a menu selection, execute a search.
  - For a first pass, begin a loop on a search for an empty Rovi ID field.
  - For a second pass on a particular ID, search for the ID.
  - For a second pass on low-confidence results, begin a loop on a search for a low confidence value. We suggest starting with a confidence level below 0.1. With experience, you may want to adjust that value.
- Call the match request that's appropriate for the catalog and record.
  - To get the best match, include in the request every parameter for which you have data.
  - For a pass with manual verification, also add the include parameter to request verification data. For best results, choose the data most appropriate for verifying a match.
- From the returned results, capture the following:
  - The Rovi ID from the first result. Save the ID.
  - The relevance scores of the first result and the second result. Save the difference in the confidence level field.
- For manual verification, do the following:
  - Present the Rovi results and your data, along with controls that allow the user to scroll the Rovi results and select the correct ID.
  - Capture the verified Rovi ID from the data and save the ID.
  - In the confidence level field, indicate a manual verification. Examples: empty the field, write a particular character, or write an unusually large confidence level value such as 0.99999999.
- On a loop, repeat or end the loop.
- Return to the user menu.
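The first pass with no verification can be sketched as a loop like the one below. This is a sketch under stated assumptions: the match callable stands in for whichever match request you use, and the result shape (a best-first list of dicts with id and relevance keys) is hypothetical, as are the record fields and the made-up IDs in the canned data.

```python
def first_pass(records, match, threshold=0.1):
    """Fill in Rovi IDs for records that lack one.

    `match` is a caller-supplied function (hypothetical) that takes a record
    and returns matches ordered best first, each a dict with "id" and
    "relevance" keys. Returns the records that need manual verification.
    """
    flagged = []
    for record in records:
        if record.get("rovi_id"):
            continue  # already matched; the first pass only fills empty IDs
        results = match(record)
        if not results:
            flagged.append(record)  # nothing returned; leave for a person
            continue
        top_score = results[0]["relevance"]
        # Assumption: with a single match, treat the second score as 0.
        second_score = results[1]["relevance"] if len(results) > 1 else 0.0
        record["rovi_id"] = results[0]["id"]
        record["confidence"] = top_score - second_score
        if record["confidence"] < threshold:
            flagged.append(record)  # candidate for the second pass
    return flagged


def fake_match(record):
    """Stand-in for a real match request, returning canned made-up results."""
    canned = {
        "Abbey Road": [{"id": "A1B2C3", "relevance": 0.95},
                       {"id": "A1B2C4", "relevance": 0.40}],
        "Greatest Hits": [{"id": "D4E5F6", "relevance": 0.80},
                          {"id": "D4E5F7", "relevance": 0.78}],
    }
    return canned.get(record["title"], [])


catalog = [
    {"title": "Abbey Road", "rovi_id": None, "confidence": None},
    {"title": "Greatest Hits", "rovi_id": None, "confidence": None},
    {"title": "Already Matched", "rovi_id": "X9Y8Z7", "confidence": 0.5},
]
review_queue = first_pass(catalog, fake_match)
```

Passing the match function in as an argument keeps the loop independent of any particular request code, so the same loop can be exercised with canned data before it is pointed at the live API. The records it returns are the ones a second, manually verified pass would pick up.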