When Smashwords signed a deal last December to supply ebook subscription service Scribd, many Smashwords authors were concerned over possible piracy problems.
Scribd was set up six years ago as a document upload service. It now hosts 50 million documents and books, receives thousands of new uploads every day, and gets around 80 million views a month from around the world. But the site has also seen a lot of copyright abuse, with pirated work uploaded by users.
Smashwords says several of its authors who had discovered unauthorized versions of their books on Scribd questioned why the firm would partner with the service.
After a meeting in January between Scribd executives and several concerned Smashwords authors, Scribd released a major update to its copyright protection system.
The copyright protection technology, called BookID, scans all Smashwords-delivered books and analyzes the text for semantic data such as word count, letter frequency, phrases and other elements.
It creates a digital fingerprint of the authorized Smashwords book and uses this to automatically detect and remove unauthorized versions. It removes all files at Scribd that match the same fingerprint and blocks the upload of future unauthorized versions. Scribd says no copyrighted content is stored or made available to the public by BookID.
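Scribd has not published how BookID works internally, so the following is only a rough sketch of the general idea described above, not Scribd's method. It assumes a fingerprint built from the three signals mentioned (word count, letter frequency and phrases) and compares two texts by the overlap of their hashed phrases. The function names, the five-word phrase length and the 0.8 threshold are all hypothetical choices made for illustration.

```python
import hashlib
import re
from collections import Counter


def fingerprint(text, phrase_len=5):
    """Build a toy fingerprint: word count, letter frequency, hashed phrases."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    letters = Counter(c for c in lowered if "a" <= c <= "z")
    total_letters = sum(letters.values()) or 1
    # Hash every run of phrase_len consecutive words, so the fingerprint
    # holds digests rather than readable passages of the book.
    phrases = {
        hashlib.sha1(" ".join(words[i:i + phrase_len]).encode()).hexdigest()
        for i in range(max(len(words) - phrase_len + 1, 0))
    }
    return {
        "word_count": len(words),
        "letter_freq": {c: n / total_letters for c, n in letters.items()},
        "phrases": phrases,
    }


def similarity(a, b):
    """Jaccard overlap of the hashed phrase sets, from 0.0 to 1.0."""
    union = a["phrases"] | b["phrases"]
    return len(a["phrases"] & b["phrases"]) / len(union) if union else 0.0


def is_probable_copy(authorized, upload, threshold=0.8):
    """Flag an upload whose phrases largely match the authorized book (assumed threshold)."""
    return similarity(authorized, upload) >= threshold
```

In this sketch, an upload that differs from the authorized text only in capitalization or punctuation yields the same phrase hashes and is flagged, while an unrelated text shares no phrases and passes. A production system would need to cope with reformatting, excerpts and deliberate obfuscation, which this toy version does not attempt.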
As of January 9, BookID had detected and removed 3,745 book files from Scribd, representing 1,725 unique Smashwords books.
In March, Scribd released a new version of BookID, which has dramatically increased the ability to detect unauthorized versions. As of this week (starting May 12), Scribd says BookID has removed 47,858 unauthorized copies of 14,090 unique Smashwords books.
Smashwords authors now supply 225,000 titles to Scribd, so the 14,090 unique books affected represent around 6% of the total number of Smashwords books. The 47,858 unauthorized copies removed, presumably uploaded by multiple pirates, equal about 21% of that total, an average of more than three copies per affected title.
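The percentages can be checked directly from the figures quoted in this article. The short calculation below uses only those numbers; the variable names are illustrative.

```python
# Figures quoted in the article for the week starting May 12.
titles_supplied = 225_000   # Smashwords titles supplied to Scribd
unique_removed = 14_090     # unique Smashwords books with copies removed
copies_removed = 47_858     # unauthorized copies removed by BookID

share_of_titles = unique_removed / titles_supplied    # affected share of the catalog
copies_vs_titles = copies_removed / titles_supplied   # removed copies relative to catalog size
copies_per_title = copies_removed / unique_removed    # average copies per affected book

print(f"Unique books affected: {share_of_titles:.1%}")            # 6.3%
print(f"Copies relative to catalog: {copies_vs_titles:.1%}")      # 21.3%
print(f"Average copies per affected book: {copies_per_title:.1f}")  # 3.4
```

Note that the second figure compares a count of copies with a count of titles, so it is a ratio rather than a share of the catalog.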
You can learn more about the BookID technology at Scribd.