DigHumNotes: Rethinking the Database

After a meeting with our local humanities computing director, I decided that it would probably be worthwhile to begin recording the individual submission data for the poems in my database. So rather than simply counting the number of instances in a particular journal, the idea would be to track every single submission as a distinct edge (or link). Adopting such an approach should also make it easier to create dynamic network visualizations that represent change over time.

With this in mind, today I began to work with a very small dataset to test how the workflow might look when going from the kind of database described above (with each poem constituting a link) to any of the social network analysis (SNA) and visualization tools. I had little trouble loading the edgelist into UCINET or ORA, and both programs automatically translated the multiple links between a poet and a journal into a link weight. The problem, however, is that both programs did this by reducing the connection to a single line, which discarded the data for each individual poem (i.e., title, date). I can imagine adding this information as an edge attribute in a program like Cytoscape, but given the editing limitations, this could mean a lot of labor-intensive input. For while I can easily bring up a list of edges and add an attribute, I can't seem to sort the edges in the same way as in my database. And I'm still stuck with the problem of having all submissions to one journal condensed into a single edge. Is there a way to prevent this from happening?
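To make the difference concrete, here is a minimal sketch in Python (standard library only) of the two ways an edgelist with one row per submission can be read. The rows are made-up placeholders rather than data from my database, and the function names are my own; this is not how UCINET or ORA work internally, only an illustration of what gets lost when parallel edges are collapsed into a weight.

```python
from collections import Counter, defaultdict

# Hypothetical placeholder rows, one per submission:
# (poet, journal, poem title, date). Not real data.
submissions = [
    ("Poet A", "Journal X", "Poem 1", "1912-03"),
    ("Poet A", "Journal X", "Poem 2", "1913-07"),
    ("Poet A", "Journal Y", "Poem 3", "1913-09"),
    ("Poet B", "Journal X", "Poem 4", "1914-01"),
]


def collapse_to_weights(rows):
    """Reduce parallel edges to one weighted edge per (poet, journal) pair.

    The title and date of each poem are dropped; only the count survives.
    """
    return Counter((poet, journal) for poet, journal, _title, _date in rows)


def group_parallel_edges(rows):
    """Keep every submission as its own edge, grouped by (poet, journal).

    Each edge keeps its attributes, so the weight can still be recovered
    as the length of the list.
    """
    grouped = defaultdict(list)
    for poet, journal, title, date in rows:
        grouped[(poet, journal)].append({"title": title, "date": date})
    return dict(grouped)


weights = collapse_to_weights(submissions)
edges = group_parallel_edges(submissions)

print(weights[("Poet A", "Journal X")])  # weight only
print(edges[("Poet A", "Journal X")])    # both poems, with title and date
```

The second form is what I am after: the weight is still available when I want it, but the per-poem information is not thrown away to get it.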

Update (8/26): Soon after writing this post I discovered that Cytoscape can import the edgelist from a .CSV file, and that rather than condensing the edges into a single weighted line, it preserves each one as a distinct link. This is extremely useful for me, though the graph itself (at least for the total time span) is obviously quite messy.
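Since the full time span is so cluttered, one option might be to export a smaller .CSV for a given date range before importing it. The following is only a sketch under my own assumptions: the column names (source, target, title, date), the placeholder rows, and the helper write_edgelist are all hypothetical, and dates are assumed to be stored as YYYY-MM-DD strings so that they compare correctly as plain text.

```python
import csv
import io

# Assumed column names; any names would do as long as they are
# mapped consistently when the file is imported.
FIELDS = ["source", "target", "title", "date"]

# Hypothetical placeholder rows, one per submission. Not real data.
rows = [
    {"source": "Poet A", "target": "Journal X", "title": "Poem 1", "date": "1912-03-01"},
    {"source": "Poet A", "target": "Journal X", "title": "Poem 2", "date": "1913-07-15"},
    {"source": "Poet A", "target": "Journal Y", "title": "Poem 3", "date": "1913-09-01"},
    {"source": "Poet B", "target": "Journal X", "title": "Poem 4", "date": "1914-01-10"},
]


def write_edgelist(rows, out, start=None, end=None):
    """Write one CSV line per submission, optionally limited to a date range.

    start and end are inclusive YYYY-MM-DD strings; ISO-formatted dates
    sort the same way as text and as dates, so string comparison is enough.
    Returns the number of edges written.
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    kept = 0
    for row in rows:
        if start is not None and row["date"] < start:
            continue
        if end is not None and row["date"] > end:
            continue
        writer.writerow(row)
        kept += 1
    return kept


# Write to an in-memory buffer here; a real file opened with
# open(path, "w", newline="") would work the same way.
buffer = io.StringIO()
count = write_edgelist(rows, buffer, start="1913-01-01", end="1913-12-31")
print(count)
print(buffer.getvalue())
```

Because every submission stays on its own line, the title and date columns travel with each edge, and a series of such files (one per year, say) could serve as the basis for looking at change over time.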

Monday, August 1, 2011
