Monday, June 08, 2009

Practice tip: if you're seeking discovery from MediaSentry, know your metadata

For those of you who will be seeking discovery into the MediaSentry "investigation", history has shown that the plaintiffs will try to resist your obtaining anything from MediaSentry other than the printouts they intend to use at the trial.

See, e.g., UMG Recordings v. Lindor, in which the stonewalling by the plaintiffs and MediaSentry is palpable.

One of the things you need, which plaintiffs will resist, is metadata.

It would be advisable to familiarize yourself with the recent decision, in the Southern District of New York, in Aguilar v. Immigration, 07 Civ. 8224 (SDNY November 21, 2008), where the Court discusses, in detail, the nature and significance of metadata.

If you are lucky enough to have a tech consultant working with you, you may want to share the decision with him or her, to see if they are in agreement with Magistrate Judge Maas's thoroughgoing treatment of the subject.

November 21, 2008, Decision




Anonymous said...

To this man, requesting documents in TIFF format, and databases in "native" format, is incredibly stupid on the part of the plaintiffs.

TIFF files are quite large and essentially unusable in that they are images of entire pages. Also they are difficult to create and difficult to use.

As for native database files, native files are only useful in the event that you have exactly the same database software running on your own servers as was used to create the database in the first place. Also, any database that exceeds the size of the media used to deliver it (e.g. CDs) is often a real pain to restore properly so that the proper software can then read it.

Metadata, btw, for those of you not versed in Information Technology, is best defined as data that describes the data. Metadata might be as simple as a file name (e.g. song.mp3 tells you that perhaps the title of a musical work is "song" and that it's in MPEG Audio Layer 3 encoding, though metadata is not guaranteed to be accurate), or much more descriptive data stored either in the file (like ID3 tags in MP3 files) or in a paired file.
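
To make the two kinds of metadata concrete, here is a minimal Python sketch. It assumes the older ID3v1 layout (a 128-byte block at the end of the file that starts with "TAG"); newer ID3v2 tags sit at the front of the file and are not handled. The function names and the sample title and artist are made up for illustration.

```python
import os

def filename_metadata(path):
    """Metadata carried by the file name alone (may be inaccurate)."""
    stem, ext = os.path.splitext(os.path.basename(path))
    return {"title_guess": stem, "format_guess": ext.lstrip(".").lower()}

def id3v1_metadata(data):
    """Return title and artist from an ID3v1 tag, or None if there is no tag."""
    tag = data[-128:]
    if len(tag) != 128 or not tag.startswith(b"TAG"):
        return None
    def text(raw):
        # Fields are fixed width and padded with NUL bytes.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()
    return {"title": text(tag[3:33]), "artist": text(tag[33:63])}

def make_id3v1(title, artist):
    """Build a 128-byte ID3v1 tag for the demo (hypothetical helper)."""
    def field(value, width):
        return value.encode("latin-1")[:width].ljust(width, b"\x00")
    # "TAG" + title(30) + artist(30) + album/year/comment/genre (65 bytes, left empty)
    return b"TAG" + field(title, 30) + field(artist, 30) + b"\x00" * 65

# Stand-in bytes for encoded audio, followed by a tag that disagrees with the name.
fake_file = b"\xff\xfb" + b"\x00" * 100 + make_id3v1("Another Title", "Some Artist")
print(filename_metadata("song.mp3"))
print(id3v1_metadata(fake_file))
```

The file name says "song" while the embedded tag says something else, which is the point about metadata not being guaranteed to be accurate.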

This man is not impressed by these plaintiffs' technical savvy.

{The Common Man Speaking}

Anonymous said...

Have any defendants been able to depose MediaSentry? And here I mean a MediaSentry employee who runs the machines and knows the technical details of their IP identification process.


raybeckerman said...

That's what I am trying to do in UMG v. Lindor. But that was put on hold by their motion to withdraw their case.

said...

"If you are lucky enough to have a tech consultant working with you.."

Ray, is it possible to crowd source this for people who are not lucky enough to have a tech consultant working for them?

I mean there are a lot of very brainy people out there who I'm sure wouldn't mind helping out with a few mins of their time (like a lot do on forums/news lists).

Or will that not be admissible in court? (or am I missing something?)


Eric said...

I would request database files in a "common interchange format such as csv, sql dump, or tsv" or something to that effect. "Network monitoring information such as .cap files, tcpdump logs, or other files".

Versions of the software used, source code for any custom software, names of those using the computers to track users, names of the people setting up the network and backend capture system.
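
As a rough illustration of what a "common interchange format" production could look like, the sketch below exports a table to CSV and to a plain-text SQL dump using Python's built-in sqlite3 module. The table name, columns, and the single row are invented for the example; nothing is known here about what database MediaSentry actually uses.

```python
import csv
import io
import sqlite3

def export_table_csv(conn, table):
    """Dump one table to CSV text, with the column names as the header row."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([col[0] for col in cur.description])
    writer.writerows(cur)
    return out.getvalue()

# Hypothetical table and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE detections (captured_at TEXT, ip TEXT, file_name TEXT)")
conn.execute(
    "INSERT INTO detections VALUES (?, ?, ?)",
    ("2004-08-07T05:24:00", "192.0.2.10", "song.mp3"),
)

csv_text = export_table_csv(conn, "detections")
sql_text = "\n".join(conn.iterdump())  # plain-text SQL dump of the whole database
print(csv_text)
print(sql_text)
```

Either output can be opened with ordinary tools, without needing the same database software that produced the original.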

How does this court ruling play on the MS evidence? Personally, if they refuse to release information to vet the "evidence" then it needs to be thrown out.

David Donahue said...

It strikes me as odd that the defendant would have to provide whole disk images of the suspected infringing computers but yet only selected data, in modified formats, is available from the computers that are reporting on that infringement.

Why would you not be able to request the whole disk image of MediaSentry's detection and analysis computers?

After all, the RIAA has made and won the argument that without this whole disk data, deletions and modifications could be concealed.

It should be as true for the defendant as for the plaintiff. If the RIAA has unrelated data on those computers they want hidden (other defendants' data) they could request that specific data be excluded in a "protective order for inspection of plaintiff's hard drives" like the reverse of the one in SONY BMG Music v. Tenenbaum.

Who knows what interesting and exculpatory evidence might be in the deleted files and cache areas of MediaSentry's computers?

It has the additional benefits of allowing the defendant's expert to test and verify the functionality of MediaSentry's software and providing the software necessary to access that metadata without it being potentially corrupted/rendered inaccurate during a file type conversion/extraction.

surfer said...

metadata is just that, data that describes data. comparing metadata is useless, since it can be defeated just by changing the name of the file. the true test should be comparing the hash (binary equivalent of the file in question) to the data found, if any, on the defendant's hard drive. The inspection should be limited to metadata (search the hard drive for files like '%.mp3'), and if any match, then and only then should the hash (binary conversion) of the original (copyrighted version), the defendant's copy and the alleged MediaDefender copy be compared. This is not copyright violation. You still have to prove actual distribution. Agents of the copyright holders cannot violate their own copyright.
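
The two-stage inspection described above (filter by name first, hash only the name matches) could be sketched roughly as follows. A hash here means a fixed-length digest computed from the file's bytes; SHA-1 is used only as an example, and the function names, pattern, and demo files are assumptions for illustration.

```python
import fnmatch
import hashlib
import os
import tempfile

def sha1_of(path, chunk=65536):
    """SHA-1 digest of a file, read in chunks so large files fit in memory."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def matching_files(root, pattern, reference_digest):
    """Stage 1: filter by file name. Stage 2: hash only the name matches."""
    hits = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            if fnmatch.fnmatch(name.lower(), pattern):
                path = os.path.join(folder, name)
                if sha1_of(path) == reference_digest:
                    hits.append(path)
    return sorted(hits)

if __name__ == "__main__":
    reference = hashlib.sha1(b"reference bytes").hexdigest()
    with tempfile.TemporaryDirectory() as root:
        for name, content in [("a.mp3", b"reference bytes"), ("b.mp3", b"other bytes")]:
            with open(os.path.join(root, name), "wb") as f:
                f.write(content)
        print([os.path.basename(p) for p in matching_files(root, "*.mp3", reference)])
```

Files whose names do not match the pattern are never opened, which is what keeps the inspection limited.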

Anonymous said...


Crowd source people who know tech?

Don't take your phone / pda / computer.


David Donahue said...

@surfer: "the true test should be comparing the hash (binary equivalent of the file in question) to the data found"

Unfortunately hashing won't match different rips/versions of a song or movie file that actually contains infringing content.

If even just a single byte is different in two files, they will have different hashes.

There are a lot of ways functionally identical files can have different hashes:
-Trim a bit of near silence off the start or end of a song.
-Rip the song/movie at a different bit rate.
-Change the encoding method or codec settings/selection (this makes a massive difference in file content, even though it looks/sounds the same)
-Change the file header/structure information inside the file without changing the actual audio/video data (different hash for the files but identical actual data).
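
A small Python sketch of the single-byte and header-only cases: the "audio" here is just stand-in bytes, the tag format assumed is a trailing 128-byte ID3v1 block, and SHA-1 is chosen only as an example digest.

```python
import hashlib

def digest(data):
    """SHA-1 digest of a byte string, as hex."""
    return hashlib.sha1(data).hexdigest()

def strip_id3v1(data):
    """Drop a trailing 128-byte ID3v1 tag, if present, leaving the audio bytes."""
    if len(data) >= 128 and data[-128:-125] == b"TAG":
        return data[:-128]
    return data

audio = bytes(range(256)) * 4  # stand-in for encoded audio data

# Case 1: flip a single bit in one byte of the audio.
flipped = bytearray(audio)
flipped[0] ^= 1
one_byte_differs = digest(audio) != digest(bytes(flipped))

# Case 2: same audio, different tag text at the end of the file.
file_a = audio + b"TAG" + b"Title A".ljust(125, b"\x00")
file_b = audio + b"TAG" + b"Title B".ljust(125, b"\x00")
whole_files_differ = digest(file_a) != digest(file_b)
audio_matches = digest(strip_id3v1(file_a)) == digest(strip_id3v1(file_b))

print(one_byte_differs, whole_files_differ, audio_matches)
```

The whole-file digests differ in both cases, yet in the second case the audio bytes are identical once the tag is stripped.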

In fact it is very hard for a computer to judge whether two song/movie files are really identical, since it's essentially a human/legal analysis of whether the change passes the reasonable man test of identicality and/or whether it has enough transformative change to be considered different.

You just have to kind of listen to the files and say "despite the static and poor quality of the recording that is definitely the song in question". This kind of judgment is easy for humans and hard for computers, and rightly so, the law favors the human analysis.

If it didn't then I could play games all day long with file sharing app design such that no two files EVER have the same hash and unfairly dodge the law.

It might be interesting, however, to see if the RIAA/MPAA has argued that hash matching is the only way to consider if a "claimed to be infringing" file matches the original content. If so, that argument could be used to undermine any claim to matching content.

There are so many ways to rip stuff and so many adjustable settings, it would be very difficult for the RIAA/MPAA to show exactly how any "potentially infringing" files were generated from the content that they own.

I don't think it would be fair to make content owners figure out and show how to transform their own works into whatever file the defendant had. Listening and human judgment should be enough.

Even though it might be fun to make them try.

Anonymous said...


haha, nice one. You could also use steganography to hide data in the least significant bits of the music. -> different file -> different hash.

Of course the point of this type of steganography is that the music still sounds the same. At least, it would take a sharper ear than mine to tell the difference.

Perceptually, the song is still the same song.
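
For the curious, a toy Python version of least-significant-bit embedding, assuming uncompressed 16-bit PCM samples (this would not survive lossy encoding such as MP3). The sample values and message bits are made up.

```python
import hashlib
import struct

def embed_bits(samples, bits):
    """Overwrite the least significant bit of each sample with one message bit."""
    if len(bits) > len(samples):
        raise ValueError("message is longer than the audio")
    out = list(samples)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & ~1) | bit
    return out

def extract_bits(samples, count):
    """Read the hidden bits back out of the first `count` samples."""
    return [s & 1 for s in samples[:count]]

def pcm_digest(samples):
    """SHA-1 of the samples packed as little-endian signed 16-bit PCM."""
    return hashlib.sha1(struct.pack(f"<{len(samples)}h", *samples)).hexdigest()

original = [1000, -2000, 3000, -4000, 5000, -6000, 7000, -8000]
message = [1, 0, 1, 1, 0, 1, 0, 1]
marked = embed_bits(original, message)

largest_change = max(abs(a - b) for a, b in zip(original, marked))
print(largest_change)                                 # no sample moves by more than 1
print(pcm_digest(original) == pcm_digest(marked))     # the digests no longer match
print(extract_bits(marked, len(message)) == message)  # the message is recoverable
```

Each sample changes by at most one step out of 65,536, so the audio is effectively the same while the hash is not.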


surfer said...

'Unfortunately hashing won't match different rips/versions of a song or movie file that actually contains infringing content.' - David

unfortunately? unfortunately??

the point is they don't even listen to the files (in court), and then don't have to prove they have copyright for that file, and that file matches the infringement claim. Offering different hashes is splitting hairs, but isn't that what the MAFIAA is doing? Present additional 'rips' with different hashes as counter evidence disproving the veracity of their hearsay, evidence gathering techniques and actual validity of the evidence itself.

David Donahue said...

"unfortunately? unfortunately??" -surfer

Yes, actually unfortunately. I actually don't approve of copyright infringement and think sharing songs and movies on P2P should not be legal. It’s as inevitable as jaywalking perhaps, but not legal.

I can't believe I'm actually about to be saying this next bit:

I could have had a fair amount of sympathy for the music and movie industries if the RIAA/MPAA were:
-Seeking reasonable civil penalties based on actual loss
-Investigating and prosecuting cases in a careful, legal and ethical manner
-Being very careful to target only defendants that they were extremely certain of
-Asking for mostly small civil fines and a restraining order to block future P2P sharing

Any large scale or commercial defendants should be hit with the big guns for hefty fines, recovered revenue plus legal and investigative costs. (Minor revenue from Google adwords does not count BTW).

My point is you should not be giving away content somebody else has created. That is for the content creator to profit from. The penalty should be something like a jaywalking ticket and be a few hundred dollars (less for the truly poor).

You really shouldn't be selling it without licensing either, but I think we can agree on that one.

The RIAA/MPAA could have done this right up front. They could have sent warnings via email and then if it didn't stop, hired good private investigators to build solid cases, sending settlement offers to the provably guilty in the $100-200 range (goes to the artist for PR reasons) and asked for a reasonable and binding promise not to do more such sharing.

If the defendant was innocently or mistakenly identified (it should be very rare) they should have had a process to recheck this that would not prejudice your case (like steps your lawyer could do for you).

If you were provably guilty but yet didn't choose to settle, then after reasonable discovery to build more evidence, they would begrudgingly sue you for those same small fines plus legal costs with an open offer to settle for reasonable costs to date.

The world of P2P would sure be a lot different if the RIAA/MPAA could stand on a podium with clean hands and say "hey guys PLEASE stop sharing my stuff, we really don't want to sue you but we will if you make us."

Instead of guilt trips and negativity they could offer cool new service models like:
-Free, any song ever recorded, Internet radio (with ads)
-Low cost licensing (like for broadcast radio) of the above where you can sell your own ads but have to host your own files.
-DRM free recordings for your personal use any way you want.

Instead they made a bunch of innocent file sharing martyrs and have slightly less clean hands than the Roman Empire.

There are lots of possible ways to make money here other than creating the resentment that fosters such ill will towards the record labels and MPAA/RIAA.

If you piss people off, they will find a way to hurt you.
It's a pity. Even though they're losing the war, the music and movie industry could still turn over a new leaf.

They could acknowledge the inevitable failure of their old world business model, stop suing their customers, embrace abolishing all DRM and promote global music/movie interoperability.

They could recognize that since the only real relationship is between the artist and listener/viewer, a significant percentage of the proceeds should go to the artist.

The RIAA/MPAA could make money for the labels by coming up with new and better ways to promote new music to me, organizing it and getting it into my hands so I could listen to it however and wherever I wanted.

The law should be about fairness and justice. The music and movie industries should be about helping their customers enjoy music and movies.

Customers are willing to spend their money with people they like to make their lives easier, better and more enjoyable.

The music and movie industry could fill that role if only they looked forward and not back.