Thursday, January 26, 2006

Calling All Techs!!! RIAA Defends Its "Investigation", Says Metadata Shows Illegal Copying

In its opposition to defendant John Doe Number 8's motion to vacate the RIAA's "ex parte discovery order", filed today in Atlantic Recording v. Does 1-25, the RIAA has submitted a declaration of Jonathan Whitehead -- an RIAA Vice President -- that the metadata in John Doe Number 8's shared files folder shows that illegal copying took place:

Declaration of Jonathan Whitehead
Exhibit A
Exhibit B

This would appear to be in contradiction to the earlier affidavit of computer programmer Zi Mei in support of the motion.

Any input from the tech community would be of interest. The response to the Whitehead declaration is due February 7th.

16 comments:

CodeWarrior said...

Although I am not a lawyer, nor do I play one on television, much of the testimony seemed to be based not so much on hard science as on hearsay and speculation of the grossest nature.

Just having a folder that is marked as a "shared folder" and that holds a large number of files proves nothing. I could create a folder called "My Shared Folder" and copy a thousand files from my hard drive to it, and it would not be prima facie evidence that the files were, per se, copyright infringements.

Of course, if they are using the DMCA as their legal authority, they need to provide proof of copyright registration to sue under the DMCA.

Having metatags that say "Jazzy D ripped this from So and So's album" is hearsay of course without conclusive proof of such unauthorized copying.

I noted tinfoil's comment (shout out to Tinfoil, how's it going my friend) and agree with the poor quality of proof aspect.

In closing, I noted the "signature" of Jonathan Whitehead. It looks ambiguously either like a poorly formed star, which this wannabe star may actually aspire to be, and hence his association with the RIAA, or it could be a giant "A". I wonder what a giant "A" could stand for, given this gentleman's personality.

~Code

CodeWarrior said...

One further comment. In looking at the programmer's testimony, and the silly exhibits A and B which Jonathan Whitehead presented, it is clear to me which has the greater convincing weight, and it is certainly NOT the man who signs his statement with a Giant Star / Giant A.

The difference in expertise is like trying to decide who knows more about the construction of light bulbs, Charlene Tilton, or the chief engineers at Westinghouse.
~CW

Larry Rosenstein said...

I agree with Chris MacDonald. Zi Mei describes the technical aspects of metadata, including that it is optional (does not affect playing the tracks) and easily modified.

It becomes a legal question whether having tracks with a lot of different metadata characteristics is proof of infringement.

An analogy might be someone who has a collection of books with hand-written notes in the margins, all with different handwriting. Is that proof that the books are stolen?

recordjackethistorian said...

The only thing that is clear is that this is a list of *someone's* songs. Lists like this are available quite freely and having a list doesn't mean he had the material on his computer. If I have an Ikea catalogue does that mean I have Ikea furniture? Of course not.

The other thing that is striking here is that these are indeed ID3 tags, but Windows users don't always use these tags. The custom for Windows users is to use the file name as metadata instead of the MPEG layer 3 audio tags. Some people are assiduous about these things, others don't care.

Jonathan Whitehead betrays his ignorance of how things really work by making assumptions which any experienced computer user would know to be wrong.

I'm a Mac user myself but I know these things about how my Windows-using friends usually do things. This is certainly not it. Where is the list of files actually found on his computer? That might be more convincing than a catalogue of a collection from an unknown source.

I have a dessert dish with a piano keyboard border on one side from a restaurant in Montreux, Switzerland. Does that mean I've been to Montreux? Could you accuse me of stealing such a plate because they do not ordinarily sell them? In actual fact, it was a gift from a friend who was traveling there and obtained it legally.

As has been pointed out above, "file generated for user " seems very suspicious to me. A user name would be the standard way of addressing a user of the system. It would be a trivial process for the RIAA to have created this document themselves. It's a list, nothing more. It proves nothing.

Good luck with this!
David
http://recordjackethistorian.blogspot.com

silencer said...

As a former help moderator for edonkey.com and overnet, both filesharing programs, this is my input:

The RIAA and MPAA have thousands of people flooding the internet's peer-to-peer networks with dead mp3 and movie files - in other words they may be labeled as you show here, but there may not actually BE a music waveform that is playable as the music it supposedly represents via the label.

A user can download thousands of files that appear to be copyrighted songs, have the right ID tags and length yet only contain static. I had heard that this practice (file faking) is illegal in the USA, as a form of fraud, but that is another story. The other issue here is that a person may attempt to download music, then cancel. The log will still show the attempt, and this may be all this list is, if these even ARE music files.

The ONLY way to prove infringement is if 1) no licence was ever purchased (has this person ever owned these songs on cassette or CD, if so then she has a license and can have a digital copy) 2) they can produce a VALID WAVEFORM. IF they can show a perfect waveform match between the song on the hard drive, and a waveform from a copyrighted CD, then you have a fingerprint match, and can start asking how that waveform (playable as the actual song) got there.

No waveform, no proof there was ever a song transferred.

I can give you an empty bag and write 'cocaine' on the bag, but it is still just an empty bag no matter what 'list' or metadata it is recorded on.

Alex H said...

As you guys have already established, it is possible for two people to create identical files using the same ripping program with the same settings.

If a music file is popular on p2p networks and someone makes their own copy from a CD they bought, putting that new (identical) rip in a shared folder will make the p2p client (e.g. LimeWire) record the users as having a file with the SHA1 hash XXXXetc.

When another user searches for a key word like the title of the song, both the old and popular file and the new file will identify as one and the same to the searcher.

Once you find another "source" on a p2p network, you are able to download metadata from them. Some p2p clients do this automatically.

So yes, it is possible that a user could have their metadata filled in automatically, just by having their own legally ripped file in a shared folder.
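
The point about identical rips can be checked with a few lines of Python. This is only a sketch: the byte strings below are hypothetical stand-ins for mp3 files, not real rips, and the names `rip_a`, `rip_b` and `rip_tagged` are made up for illustration. It shows that SHA1 depends only on the bytes of the file, so two byte-for-byte identical files get the same hash no matter who made them, while a copy with a tag embedded in the file itself hashes differently.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA1 digest of the given bytes as a hex string."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical stand-ins for two independent rips that came out byte-identical.
rip_a = b"\xff\xfb\x90\x00" + b"fake-audio-frame" * 1000
rip_b = bytes(rip_a)

# The same audio with 128 extra bytes of embedded tag data appended.
rip_tagged = rip_a + b"TAG" + b"\x00" * 125

same_hash = sha1_hex(rip_a) == sha1_hex(rip_b)
tag_changes_hash = sha1_hex(rip_a) != sha1_hex(rip_tagged)

print("identical bytes, identical SHA1:", same_hash)
print("embedded tag changes SHA1:", tag_changes_hash)
```

In other words, a matching hash says the bytes match. It says nothing about where those bytes came from.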


As far as the Kazaa/LimeWire thing goes, they are completely separate programs and networks: A Kazaa user can't download from a LimeWire user and a LimeWire user can't download from a Kazaa user.

Kazaa uses the FastTrack network to communicate with other Kazaa users.

LimeWire uses the Gnutella network to communicate with other LimeWire users.

This is a very basic concept and anyone who doesn't know the difference between these clients and networks can't claim to have any knowledge of modern peer-to-peer technologies.


In case this ends up coming up in court somewhere:

1) The Gnutella network is open source - anyone can code a client that connects to this network, provided they follow the published specifications.

2) As the original Gnutella code was released under the General Public Licence, anyone using it as a base to work from would have to follow the conditions set out in the GPL.

3) Anyone not following the GPL in writing their software would (obviously) be breaking the licence under which Gnutella was released.

4) If I remember correctly, GPL-breakers forfeit their right to use the GPLed code.

5) The vast majority of developers in the p2p community in general and in the Gnutella community specifically would never endorse a "closed source" client as a well written application.


To sum up those points, it's possible that the RIAA operatives have no legal right to use any software they've built to track people's downloading habits, and even if they do, they would need to display the code publicly for anyone to take the software's results seriously.

Alex H
www.techlovesart.blogspot.com

Alex H said...

I have a few things, so I'll try to break them up:

First point - Supernodes

To start with, are we talking about supernodes or ultrapeers? "Supernode" is a term used on Kazaa's FastTrack. "Ultrapeer" is a term used in context with Gnutella. They basically do the same job, but there are obviously some differences considering that they are components of completely separate networks.

A Gnutella ultrapeer acts like a temporary server and acts as a clearing house for search queries made by users (such as a keyword search). Your PC may be "chosen" by the network to act as an ultrapeer (provided your PC is capable of handling lots of search data) or you can elect to make yourself an ultrapeer.

As a search query is passed from a user to an ultrapeer, which passes the query on to other (regular) nodes, the ultrapeer basically gets to see what you're searching for.

This has been exploited by network spammers - as soon as you send a query for "A really cool song.mp3" the spammer uses that info to return a result for the exact thing you searched for. The spammer will have set up a number of nodes to act as sources for the spam file, which will return a "Yes! I have 'A really cool song.mp3'" message directly to the user.

Basically they rename their spam files (like viruses) on the fly. Users are tricked into thinking the spam file is the file they want and download it.

So yes, it is possible to hack up an ultrapeer to find out what people are searching for. This does not mean the ultrapeer is aware of what people actually download, it just means the ultrapeer knows what people are searching for. Whether they find it and download it or not is a completely different matter.
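
The difference between seeing a search and seeing a download can be shown with a toy model. This is a deliberately simplified sketch and not the Gnutella wire protocol: the `Leaf` and `Ultrapeer` classes, their methods and the file names are all hypothetical. The only thing it models is that queries pass through the ultrapeer while the transfer itself happens directly between two peers.

```python
from dataclasses import dataclass, field


@dataclass
class Leaf:
    """A regular node: it has a name and a shared folder (file name -> bytes)."""
    name: str
    shared: dict

    def download_from(self, other: "Leaf", filename: str) -> None:
        # The transfer is peer to peer; no ultrapeer takes part in it.
        self.shared[filename] = other.shared[filename]


@dataclass
class Ultrapeer:
    """A node that relays keyword searches for the leaves attached to it."""
    leaves: list = field(default_factory=list)
    query_log: list = field(default_factory=list)

    def search(self, requester: Leaf, keyword: str) -> list:
        # The ultrapeer necessarily sees who asked for what.
        self.query_log.append((requester.name, keyword))
        return [
            (leaf, filename)
            for leaf in self.leaves
            if leaf is not requester
            for filename in leaf.shared
            if keyword.lower() in filename.lower()
        ]


alice = Leaf("alice", {})
bob = Leaf("bob", {"A really cool song.mp3": b"not really audio"})
ultrapeer = Ultrapeer(leaves=[alice, bob])

hits = ultrapeer.search(alice, "cool song")
log_after_search = list(ultrapeer.query_log)

# Alice may or may not follow up; either way the ultrapeer log does not change.
alice.download_from(bob, "A really cool song.mp3")
log_after_download = list(ultrapeer.query_log)

print(log_after_search)
print(log_after_download == log_after_search)
```

A log kept at the ultrapeer in this model records a search and nothing else, which is the limit described above.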


Second Point - Validity of evidence

Without knowing exactly how RIAA has been collecting their evidence, that evidence could never be called anything but suspect.

It is well known that the RIAA (through companies like MediaSentry) pump fake files onto the p2p network. They obviously have some skill in that area, but for a company that makes their money through deceiving people, I personally would be skeptical of any information they present.

I can't emphasise this enough: in the p2p world, nobody takes your claims seriously unless you can prove the result is accurate by showing off the source code.

The RIAA saying to a court that they have "evidence" is like someone saying "trust me, I'm a doctor" - nobody could take them seriously without getting a second (or third, or fourth) opinion from an independent source. The RIAA will need to get their methods validated by someone other than their subcontractors.

It is quite possible that they "got the right guy", but by using dodgy information gathering that would be by pot luck rather than "evidence" by any legal standard.

This leads to my next point:

GPL issues

As far as I know, it is flat out illegal to knowingly break a licence agreement like the GPL - you give up any sort of right to use the software code.

I'm not a lawyer, but I would be interested to see what the RIAA's position is if it turns out they have been gathering "evidence" with tools they have no right to use (kind of like a private investigator who breaks into your house to snoop around).

Lastly,

Metadata

Metadata literally means "data about data" or "information about information". There should be no expectation that one piece of data (like metadata) is any more reliable than another (like a file).

We know that files can be unreliable because the RIAA themselves have demonstrated it by creating fake files, so it is just as likely that the metadata is inaccurate too.

I really don't know how you can prove that a file transfer took place "because the metadata said it did". Metadata doesn't contain things like "Transferred from IP 185.678.etc.etc. at 10:57PM".

Metadata is information about a file's contents, not its history.

On the whole, metadata should be regarded as:

* Easily faked
* Unreliable
* Not capable of showing a history
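
To make the "easily faked" point concrete, here is a rough Python sketch using the ID3v1 layout, the oldest and simplest mp3 tag format: a 128-byte block at the very end of the file that starts with the marker `TAG`. The function name `set_id3v1` and the sample values are made up for illustration, and the "audio" is just placeholder bytes, not a real mp3. Newer ID3v2 tags sit at the start of the file and are more involved, so this only covers the simple case.

```python
def set_id3v1(mp3: bytes, title: str, artist: str, comment: str) -> bytes:
    """Return a copy of the file with its ID3v1 tag replaced (or added)."""
    # Drop an existing ID3v1 tag if the last 128 bytes start with "TAG".
    has_tag = len(mp3) >= 128 and mp3[-128:-125] == b"TAG"
    audio = mp3[:-128] if has_tag else mp3

    def fixed(text: str, size: int) -> bytes:
        # ID3v1 fields are fixed width, padded with zero bytes.
        return text.encode("latin-1")[:size].ljust(size, b"\x00")

    tag = (
        b"TAG"
        + fixed(title, 30)
        + fixed(artist, 30)
        + fixed("", 30)       # album, left empty here
        + fixed("", 4)        # year, left empty here
        + fixed(comment, 30)
        + b"\xff"             # genre byte: 255 means "not set"
    )
    return audio + tag


# Placeholder bytes standing in for the audio part of an mp3.
audio_only = b"\x00" * 1000

first = set_id3v1(audio_only, "Any Title", "Any Artist", "ripped by somebody")
second = set_id3v1(first, "Any Title", "Any Artist", "ripped by somebody else")

print(first[:-128] == second[:-128])  # the audio bytes are untouched
print(first[-128:] == second[-128:])  # only the tag differs
```

Anyone with write access to the file can do this, and the tag carries no record of who wrote it or when.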


If you want to get really technical, head over to the FrostWire IRC channel (www.frostwire.com) or the LimeWire IRC channel - they are both filled with the people who actually write the gnutella code and they'll be able to tell you in more detail.


Cheers,
Alex H
www.techlovesart.blogspot.com

Alex H said...

I still can't work out how they got that "UserLog".

As far as I know, LimeWire doesn't provide anything like that, and the "Matching files" stuff at the top is certainly something that's been added in by the RIAA investigators.

Whitehead says nothing about how the RIAA actually obtained their "evidence", except that they apparently did.

Alex H said...

By the way, so far I've found 7 of the songs listed in Exhibit A on Bitzi.com, a free metadata database.

Alex H said...

Oh, and I also managed to create my own mp3 rip with the same SHA1 as another track that is quite popular on the Gnutella network. The source was a CD I bought in Barnes & Noble, which has an "Imported" sticker on it and I don't know which other countries got the same imported version of the CD.

I found the ripping and encoding apps used (from the metadata) on the existing file and used the same versions of those apps to rip and encode my own mp3 file. All the metadata was available for the (existing) file, so I just used it to rip and encode with the same settings. It was pretty easy actually.

I moved my new rip into a shared folder and got my p2p app to "acquire metadata". It searched the network for a bit and now my new rip has all the same metadata attached to it as the existing file that was already on the network.

Just in case anyone is interested.

Tsu Dho Nimh said...

I have some comments on the declaration:

1. He claims to be able to "competently testify as to the facts" ... but unlike any expert witness deposition I have seen, does not detail what makes him competent about those facts. Unless he's a geek, with a specific area of expertise, he's not competent. He's just a suit. Get the "Kernighan" deposition in the SCO vs IBM case for an example of a real expert's deposition.

2. He claims to have personally "supervised, directed or reviewed" the results ... what does that entail? Did he REVIEW these results, or just tell a flunky to get something?

3. What was their PROCESS for reviewing? Did they test to make sure it didn't get false positives?

4. Is there any innocent way that this folder could have been shared? If Doe8 was using MSFT, I think it shares with world+dog by default.

5. How is the "sheer number of files" indicative of anything except that there were lots of files?

6. How were the files they downloaded kept? Is there a chain of evidence, and secure, digitally signed copies of the download, or could any over-eager flunky with a meta-tag editor get to them?

7. They "undertook an expedited review" of the files IN RESPONSE TO the motion. Had they reviewed them before then? What were the results of the preliminary review?

8. The broad range of software and comments does tend to indicate that Doe8 was DOWNLOADING songs, which might not be illegal according to USC-17 (despite what the deposition says), but they have to prove he was making them available for download, deliberately.

Tsu Dho Nimh said...

Zi -
It may be true that the RIAA is not a government, but they MUST be able to explain exactly how they got their evidence, and show that it could not have been falsified or altered while it was in their possession.
What was their method, what software did they use, etc.?

"I moved my new rip into a shared folder and got my p2p app to "acquire metadata"." And this is what a LOT of people do, because typing all that metadata is a pain in the butt!

Bacardi said...

What effect will GhostSurf have on the internet addresses gathered by the RIAA?

I am not a user of p2p so I cannot test this. I do use GhostSurf but I don't know how effective it is.

Alex H said...

@ Zi

Regarding the Bitzi.com metadata, I had some issues and can't find the links I made to the files.

This however is the link to one of the files listed near the top of Exhibit A:
http://bitzi.com/lookup/VYKFND5TDSEF6XOGIIN75PM6KB2NDX2F

The last bit of the URL is the SHA1 hash, so you can go through the list pretty quickly by just copying the SHA1 hash from Exhibit A and substituting it in the URL.
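
For anyone who wants to script the lookup, here is a small Python sketch. The hash at the end of the URL above is 32 characters of Base32, which is how a 20-byte SHA1 digest comes out in that encoding. The helper names `sha1_base32` and `lookup_url` are my own, the URL prefix is simply copied from the link above, and the sketch does not check whether the site answers.

```python
import base64
import hashlib

# Prefix copied from the lookup link quoted in this comment.
BITZI_LOOKUP = "http://bitzi.com/lookup/"


def sha1_base32(data: bytes) -> str:
    """SHA1 of the data, written as 32 Base32 characters."""
    digest = hashlib.sha1(data).digest()  # 20 raw bytes
    return base64.b32encode(digest).decode("ascii")


def lookup_url(base32_hash: str) -> str:
    """Build a lookup URL by appending the hash to the prefix."""
    return BITZI_LOOKUP + base32_hash


# Hash taken from the URL quoted above.
exhibit_hash = "VYKFND5TDSEF6XOGIIN75PM6KB2NDX2F"
print(lookup_url(exhibit_hash))

# Hashing your own file contents works the same way (placeholder bytes here).
print(lookup_url(sha1_base32(b"placeholder file contents")))
```

To hash a real file, read it in binary mode and pass the bytes to `sha1_base32`.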


Regarding the SHA1 duplication, buy two copies of the same CD. Rip and encode one of them, then use the same tools to rip and encode from the second one. You can try it on two different computers if you like, but provided you use the same tools on the same settings, there is a good chance you'll create two files with the same SHA1 hash.

raybeckerman said...

Great job, guys!!!!!

On behalf of all those folks out there being pushed around by the RIAA, thanks for your help in fighting back.

raybeckerman said...

By the way, if the judge sets an oral argument date, I'll post it in the directory of upcoming court dates. It would take place at 40 Centre Street in Manhattan.

Court proceedings are open to the public, but be sure to allow extra time to go through the metal detectors.