Comments: Press Release

Great news, Matt. I'm curious as to how much cog sci is brought to bear on this project. It's quite clear that the search for "full patterns of text" is, in essence, the search for full patterns of thought or, at least, expressed thought. How will issues of materiality be dealt with here? Are minings going to be conducted across media, within media sets or without concern for media? Regardless, this has me quite excited. Congratulations.

Posted by Marc at October 12, 2004 12:49 PM

Marc,

What makes the project unique, I think, is the rich descriptive markup that characterizes humanities text collections. While there's been a lot done with plain text data mining in other domains, here we'll be able to take advantage of the markup denoting formal, material, historical, and other aspects of the texts. Stay tuned!

Posted by MGK at October 12, 2004 01:16 PM

Congratulations again Matt. Very exciting stuff.

Posted by Jason at October 12, 2004 01:27 PM

That's what I assumed was happening-- I was just interested, I guess, as to how these markups would be segregated. As you know quite well, this is a huge part of the process. I was simply wondering what the categorizations might be.

Posted by Marc at October 12, 2004 01:53 PM

Wow, what great news, Matt! Congratulations. These tools you'll be developing -- will they be applicable to primary sources in languages other than English? I'll join Marc and undoubtedly others in the excitement.

Posted by vika at October 12, 2004 08:24 PM

Thanks for the well-wishing, everyone. Vika, several of the repositories we'll be working with have substantial holdings in non-English texts. Marc, the kinds of decisions about materiality and categorization to which you refer will ultimately be made by the user, not the software . . .

Posted by MGK at October 13, 2004 08:43 AM

Again, I was only referring to the tagging that each text would receive. Tagging is a sorting, I think, that is quite different from that of front end user decisions.

Posted by Marc at October 13, 2004 11:01 AM

We're not going to be doing any tagging ourselves. We'll be working with the existing collections of the institutions listed in the announcement.

Posted by MGK at October 13, 2004 11:10 AM

Got it. Finally.

Posted by Marc at October 13, 2004 11:15 AM

Guess I should probably have read the description a bit more carefully, eh?

Posted by Marc at October 13, 2004 11:23 AM

Well, you can keep us honest ;-)

Posted by MGK at October 13, 2004 08:42 PM

This is *very* exciting stuff, Matt! Congratulations to you and all the participants.

Posted by George at October 14, 2004 11:48 PM

Thanks again, George, everyone. Nice to come home after a trip, clear away the comment spam, and find this under the accumulated crud.

Posted by MGK at October 17, 2004 07:38 PM