Not All TIFFs Are Created Equal
Processing electronic discovery data can produce surprises in the complexity or size of the data. This can make it hard to estimate a project timeline accurately before loading the data and performing some preliminary analysis.
For example, we recently received 20 spreadsheets that needed to be converted into TIFF images and produced to opposing counsel. The client called and asked for an estimated time to complete the project. Because there were only 20 spreadsheets, we estimated that we would complete the project within a few hours. Assuming 50 pages per spreadsheet, we expected about 1,000 TIFF images.
After we received the data, we loaded it into our system and created TIFF images of the spreadsheets. It turned out that the 20 spreadsheets generated close to 100,000 TIFF images, or pages (an average of 5,000 pages per spreadsheet). One spreadsheet alone converted into approximately 20,000 TIFF images. This meant that the page count was almost 100 times larger than we had expected. As a result, the project took longer than our original estimate. The good news was that most of the spreadsheets had a lot of blank pages and other “quirky” formatting issues. In the case of the 20,000-page spreadsheet, we were able to fix the formatting (without, of course, changing any of the original data), which reduced the spreadsheet to a few hundred pages. We were also able to reduce the page counts of the other spreadsheets by a similar proportion. The additional time that we took to fix the formatting saved counsel many hours of review time and the associated cost.
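The arithmetic behind the gap between the estimate and the actual output is simple enough to check in a few lines. The sketch below uses only the figures quoted above, with "close to 100,000" rounded to 100,000; the variable names are illustrative and not part of any vendor tool.

```python
# Figures taken from the example above ("close to 100,000" rounded up).
spreadsheets = 20
assumed_pages_per_spreadsheet = 50
actual_total_pages = 100_000

# The original estimate: 20 spreadsheets at an assumed 50 pages each.
estimated_total_pages = spreadsheets * assumed_pages_per_spreadsheet

# What the conversion actually produced, on average, per spreadsheet.
actual_average_pages = actual_total_pages / spreadsheets

# How far off the estimate was.
overrun_factor = actual_total_pages / estimated_total_pages

print(f"Estimated pages: {estimated_total_pages}")
print(f"Actual average pages per spreadsheet: {actual_average_pages:.0f}")
print(f"Overrun factor: {overrun_factor:.0f}x")
```

Any estimate built on an assumed pages-per-file figure scales linearly with that assumption, so an assumption that is off by a factor of 100 produces an estimate that is off by a factor of 100.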
Bottom line: when requesting a firm timeline and cost estimate from an electronic discovery vendor, it is best to give them the actual data and ask them to perform a preliminary analysis before finalizing the estimate. This will ensure a much more realistic estimate.
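As a rough illustration of what a preliminary analysis might look like, the sketch below estimates how many pages a worksheet would print to, based on the extent of its used range. This is not the method we used on the project described above. The function names, the 50-rows-per-page and 10-columns-per-page defaults, and the sample row and column counts are all assumptions for illustration. Real page counts depend on print settings, column widths, row heights, and page breaks.

```python
import math

def estimate_pages(used_rows, used_cols, rows_per_page=50, cols_per_page=10):
    """Rough page estimate for one worksheet (hypothetical helper).

    Assumes the sheet prints as a grid of pages, each holding a fixed
    number of rows and columns. These defaults are illustrative only.
    """
    if used_rows <= 0 or used_cols <= 0:
        return 0
    pages_down = math.ceil(used_rows / rows_per_page)
    pages_across = math.ceil(used_cols / cols_per_page)
    return pages_down * pages_across

def flag_outliers(page_estimates, assumed_pages=50, factor=10):
    """Return names of files whose estimate exceeds the assumption by `factor`."""
    return [name for name, pages in page_estimates.items()
            if pages > assumed_pages * factor]

# Hypothetical sheet: 300 rows of real data in 15 columns, but stray
# formatting extends the used range down to row 65,536.
with_stray_formatting = estimate_pages(65_536, 15)
after_cleanup = estimate_pages(300, 15)

print(with_stray_formatting, after_cleanup)
print(flag_outliers({"a.xls": with_stray_formatting, "b.xls": after_cleanup}))
```

Even a crude check like this, run across every file before quoting a timeline, would flag the spreadsheets most likely to balloon during TIFF conversion.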