Earlier this week, I finally completed the last batch of data entry work for the current phase of this project. This involved entering information for major poets connected with the "Shi to Shiron" poetry journal, several of whom rank among the most prolific in terms of total output between 1920 and 1944. I now have nearly 50 poets in my database, with a timestamp for each piece appearing in any of the journals to which they contributed. Owing to time and manpower constraints, I chose to input only the year and month for each poem or essay, and did not record titles; each piece is identified only by a uniquely assigned integer. At a later phase of the project, it may well be worth including things like title, volume number, and page, but only if the intention is to create an interactive resource where someone might want to call up that information.
Having completed the data entry, I then began loading it into various visualization programs (primarily Cytoscape and Gephi) and experimenting with the possibilities for meaningful display of the data. This included creating individual time slices for the two-mode (bipartite) graph of poets x journals so as to better chart the dynamic and shifting relationships in the network over time. I'm also working on creating time slices based on one-mode affiliation graphs generated from the bipartite data. These are graphs which attempt to represent the weighted connections between individual poets by way of their participation in the various journals. Thus, for instance, if Poet A contributed 8 pieces to Journal Z, and Poet B contributed 4 pieces to that journal during the same year, then we can say that they are linked through Journal Z by the smaller of their two contribution counts, i.e., 4. Were the two poets also to contribute to another journal that year, then the corresponding minimum for that journal would be added to 4 to produce the total weight of their connection. I am not yet finished with the time slices, but my hope is that they will create an insightful picture of how the poets involved in "Gakko" and "Shi to Shiron" were related to one another over time and how their career trajectories ebbed and flowed with the fluctuations of the cultural field.
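To make the weighting rule concrete, here is a minimal Python sketch of the projection described above. It is not the workflow I actually used in Cytoscape or Gephi; the function name project_poets and the (poet, journal, year) record layout are assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import combinations


def project_poets(records):
    """Build yearly poet-to-poet edge weights from bipartite records.

    records: iterable of (poet, journal, year) tuples, one per piece
             (a hypothetical layout, not the actual database schema).
    Returns {year: {(poet_a, poet_b): weight}}.
    """
    # counts[year][journal][poet] = number of pieces that poet placed there
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for poet, journal, year in records:
        counts[year][journal][poet] += 1

    slices = {}
    for year, journals in counts.items():
        weights = defaultdict(int)
        for per_poet in journals.values():
            # Each pair sharing a journal is linked by the smaller of the
            # two counts; the minima are summed across journals.
            for a, b in combinations(sorted(per_poet), 2):
                weights[(a, b)] += min(per_poet[a], per_poet[b])
        slices[year] = dict(weights)
    return slices


# Poet A: 8 pieces in Journal Z, 1 in Journal Y; Poet B: 4 in Z, 3 in Y.
records = (
    [("Poet A", "Journal Z", 1930)] * 8
    + [("Poet B", "Journal Z", 1930)] * 4
    + [("Poet A", "Journal Y", 1930)] * 1
    + [("Poet B", "Journal Y", 1930)] * 3
)
print(project_poets(records)[1930])  # {('Poet A', 'Poet B'): 5}
```

The two poets are linked by 4 through Journal Z and by 1 through Journal Y, giving a total weight of 5 for that year. Slicing by year is just one choice here; the same counting would work on any other time window.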
In the course of doing this visualization work, which has often come at considerable personal cost and with questionable analytical benefit, a number of ideas have arisen as to how we might make the most effective use of the data we have and what fruitful avenues of exploration might lie ahead. One thing that has become apparent is the value of color-coding nodes based on different attributes. At present, my color scheme relies on dividing the poets according to their affiliation with either of the two journals being surveyed. Thus the "Gakko" poets are neon purple, and the "Shi to Shiron" poets are light blue. What was perhaps most striking was that this division into two groups corresponded almost naturally with the topology of the graph, which on its own pushed the two groups of poets to either side of a mitochondrion-like ellipse. Those poets with the most shared ties between the two groups gravitated toward the center, yet without daring to cross into the other's territory. Here is an image of the entire dataset, from the years 1922-1944, and a closeup of that same image. The graphs themselves aren't that meaningful given the large time span, but it is easy to get an idea of the kind of grouping I am trying to describe here. Given that the layout algorithms use edge weights to determine the relative repulsion and attraction of the nodes, it is perhaps not surprising that the two groups of poets would end up like this, with the more prolific poets gathered at the center and the more minor ones pushed to the fringes of their respective blocks.
What I would like to try in the weeks ahead is to color the nodes according to different attributes (place of birth, education level, place of education, number of pieces submitted) and see how these do or do not line up with the initial color coding. What might also prove interesting is to remove some of the more dominant nodes and consider how the second- and third-tier poets (at least in terms of output) are connected to one another.
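As a rough idea of what removing the dominant nodes could look like outside the visualization tools, here is a small Python sketch. The function drop_top_poets, its arguments, and the sample numbers are all hypothetical; it simply drops the k poets with the highest output and keeps the edges among everyone else.

```python
def drop_top_poets(edge_weights, output_counts, k):
    """Remove the k most prolific poets and every edge touching them.

    edge_weights:  {(poet_a, poet_b): weight} for one time slice
    output_counts: {poet: total number of pieces}
    Both layouts are assumptions made for this illustration.
    """
    # Rank by output, highest first; ties are broken alphabetically.
    ranked = sorted(output_counts, key=lambda p: (-output_counts[p], p))
    top = set(ranked[:k])
    remaining = {
        pair: w for pair, w in edge_weights.items() if not (set(pair) & top)
    }
    return top, remaining


# Made-up numbers, only to show the shape of the inputs.
edges = {("A", "B"): 5, ("B", "C"): 2, ("C", "D"): 1}
output = {"A": 40, "B": 12, "C": 9, "D": 3}
removed, rest = drop_top_poets(edges, output, k=1)
print(removed, rest)
```

The remaining edge list could then be re-imported into Cytoscape or Gephi and laid out again to see whether the lower-tier poets still fall into the same two blocks.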
Time permitting, I would also like to experiment with these avenues of analysis, interpretation, and data enrichment:
1) Create a timeline of two-mode graphs to better show the lifespan of the journals over the entire period and to give a better sense of the total number of active poets in any one year.
2) Create affiliation graphs where the journals are the nodes, instead of the poets, and see what kinds of groupings emerge in any one period or over the entirety of the time span. This data could also be geocoded so as to give a sense of the relation between Tokyo, provincial, and colonial journals. We would have to be careful, however, to contextualize all results as a product of just the 50 or so poets involved, and therefore not reflective of the total number of contributors to journals other than "Gakko" and "Shi to Shiron".
3) Scan images of all the journal covers and associate these images with the appropriate nodes using the Cytoscape custom graphics manager. This might be worth saving for a later date, but I could also just do a few samples so as to provide a sense of the possibilities of this project as a fully interactive database that does not completely erase the material specificity of the objects being represented (à la Manovich's argument).
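The first two items in this list could be prototyped quickly before committing to any graphics work. The Python sketch below is only an illustration: the function names and the (poet, journal, year) record layout are assumptions, and weighting a journal-to-journal link by the number of shared poets is one possible choice, not something I have settled on.

```python
from collections import defaultdict
from itertools import combinations


def active_poets_per_year(records):
    """Count distinct poets who published at least one piece in each year."""
    poets = defaultdict(set)
    for poet, _journal, year in records:
        poets[year].add(poet)
    return {year: len(names) for year, names in sorted(poets.items())}


def project_journals(records):
    """Link two journals by the number of poets who contributed to both.

    Assumption: weight = count of shared poets over the whole span.
    Only the poets in the dataset are counted, so the weights say nothing
    about a journal's full contributor list.
    """
    contributors = defaultdict(set)
    for poet, journal, _year in records:
        contributors[journal].add(poet)
    weights = {}
    for a, b in combinations(sorted(contributors), 2):
        shared = len(contributors[a] & contributors[b])
        if shared:
            weights[(a, b)] = shared
    return weights


# Invented sample records, only to show the expected input shape.
sample = [
    ("Poet A", "Gakko", 1930),
    ("Poet B", "Gakko", 1930),
    ("Poet A", "Shi to Shiron", 1931),
    ("Poet C", "Shi to Shiron", 1931),
]
print(active_poets_per_year(sample))
print(project_journals(sample))
```

Restricting the records to a single year or period before calling project_journals would give the per-period groupings mentioned in item 2.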