Past Events

Special Colloquium

World-Historical Dataverse Colloquium, 27 March 2012

A workshop on World-Historical Dataverse vision and research. Click on the link above for details on the March 2012 Colloquium.

World Historical Dataverse Workshop, 23 February 2011

This workshop aimed to identify transformational research issues in large-scale historical datasets, select specific deliverables, and build a community of collaborators. Click on the link above for details on the February 2011 workshop.

Monthly Workshops at Pitt

Report on December Monthly Workshop at Pitt

Tuesday, 6 December 2011
3:00 – 5:00 p.m.
Present: Geof Bowker (iSchool), Ryan Champagne (MLIS), Nancy Condee (Slavic & Film Studies), Diego Holstein (History), Cory Knobel (iSchool), Courtney Loder (MLIS), Pat Manning (History)

Speaker: Pat Manning (Director, World History Center).
“Social Science Theory: Can It Be Unified?”
The objective of this analysis is to locate the underlying unity, similarity, and connections within social science theory. This objective comes out of the project of constructing a world-historical dataset, intended to provide consistent data on human experience and activities for all parts of the world, aggregable up to the global level. Linking data into an overall picture will be possible only if the theories underlying the data have some consistency. This is a preliminary and exploratory analysis, intended to reveal a range of possibilities.

The analysis began with a broad vision of describing and analyzing global social change, in an attempt to give a balanced description of such issues as life and death, governance at multiple levels, social and economic processes, processes of learning, migrations of individuals and communities, elite and mass interactions, and the exploitation or renewal of the natural and biological environment. All of these factors are commonly analyzed through distinct disciplines, each with its own theory.

Possible avenues for locating parallels in social science theories include:

  • Identifying common variables used in multiple theories (e.g. time, distance, population)
  • Identifying common dynamics used in multiple theories (growth, maturation, cycles, diffusion, equilibrium, entropy)
  • Identifying parallels in the social institutions treated in various theories

Issues raised in discussion of the presentation included micro and macro scales of theory; the contrast between thinking and the underlying phenomena; cohort effects, including earlier efforts to unify the social sciences; periodic consensus on analytical metaphors and styles; and the many meanings that may be attributed to a given variable.

Click here to listen to the December 2011 talk
(.mp3 format [to save the file, right click on the link and "save link as"])

Report on November Monthly Workshop at Pitt

Tuesday, 1 November 2011
3:00 – 5:00 p.m.
Present: Dan Bain (geology), Geof Bowker (information science), Elizabeth Campbell (history), Nancy Condee (global studies), Diego Holstein (history), Yongxu Huang (public health), Ahmet Izmirlioglu (history), Hassan Karimi (information science), Pat Manning (history), Carlos Sanchez (arts & sciences), Xi Zhang (sociology).

Speaker: Rane Johnson-Stempson (Principal Research Director, Microsoft Research Connections).
While Microsoft Research works on blue-sky research with a 6-to-10-year time frame, Microsoft Research Connections works from 3 to 8 years out, building external academic connections with a focus on health, environment, computer science, human-natural interactions, and the educational and scholarly community. Its foci include ChronoZoom, Women in Research, technology and Sex Traffic, and community capability.

ChronoZoom is linked to the Gates Big History Project, which is to provide a Big History curriculum for secondary students. ChronoZoom, in its forthcoming beta version and subsequent versions, is to become the comprehensive, accessible tool giving students, teachers, and general users access to information on the cosmos, earth, life, and humanity. ChronoZoom focuses especially on historical data, and the World-Historical Dataverse project aims at developing and organizing historical data.

The World-Historical Dataverse project is invited to become a partner in the ChronoZoom project, especially by providing data to the group led by Walter Alvarez at University of California, Berkeley, which will organize and analyze the data for output to Big-History viewers. Microsoft Research Connections, coordinating this and other partnerships (as with CERN, WHO, publishers), is emphasizing visualization at this early stage, followed by a focus on creating a real-time, cloud-based authoring tool, to be completed by September 2012.

In the discussion, Dataverse group members grappled with the scale and pace of this project, but expressed enthusiasm for joining the partnership by collecting and sharing historical data. In later discussions it was agreed that Dataverse group members will prepare comments on each ChronoZoom release and submit data as our stream of incoming historical data grows. Continuing this cooperation between Dataverse and Microsoft Research Connections seems likely to benefit each partner.

Click here to listen to the November 2011 talk
(.mp3 format [to save the file, right click on the link and "save link as"])

Report on October Monthly Workshop at Pitt

Tuesday, 4 October 2011
3:00 – 5:00 p.m.
Present: Dan Bain (geology), Geof Bowker (information science), John Clark (information science), Karl Grossner (speaker – UCSB), Stephen Hirtle (information science), Diego Holstein (history), Pat Manning (history), Christina Robles (information science), Carlos Sanchez (Arts & Sciences), Vladimir Zadorozhny (information science).

Speaker: Karl Grossner (Univ. of California – Santa Barbara)
Karl worked from an unpublished paper, “Towards an Emergent Spatial History Ontology,” which in turn draws on his 2010 UCSB dissertation in Geography, “Representing Historical Knowledge in Geographic Information Systems.” His presentation summarized his ontological work and offered recommendations on steps for the World-Historical Dataverse project.

Framework. Karl emphasized “an approach that is ontology-driven, information-based, and event-centered – modeling not reality but people’s knowledge of reality.” The framework includes four temporal entities: event, activity, process, and state. Events are particular occurrences, individual or composite, with varying time spans; they may be subdivided or aggregated without limit. Activities are human practices; they are constituents of events. States are non-event data created from observations of the human record. Processes are series of state changes, linked by events, artifacts, or causal explanation. Overall, this framework is designed to support an event-centered approach to classification.
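
The four temporal entities described above can be sketched as a small data model. This is only an illustrative sketch and not Karl's implementation: the class names, the fields (such as start_year and end_year), and the aggregate helper are assumptions introduced here, and processes are linked only by events for simplicity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Activity:
    """A human practice; a constituent of events."""
    name: str


@dataclass
class Event:
    """A particular occurrence with a time span; may contain sub-events."""
    name: str
    start_year: int  # hypothetical field: years are one possible time unit
    end_year: int
    activities: List[Activity] = field(default_factory=list)
    sub_events: List["Event"] = field(default_factory=list)

    def span(self) -> int:
        """Length of the event in years."""
        return self.end_year - self.start_year


@dataclass
class State:
    """Non-event data drawn from an observation of the human record."""
    subject: str
    attribute: str
    value: float
    year: int


@dataclass
class Process:
    """A series of state changes, linked in this sketch by events only."""
    name: str
    states: List[State]
    linking_events: List[Event] = field(default_factory=list)

    def changes(self) -> List[float]:
        """Differences between consecutive states, ordered by year."""
        ordered = sorted(self.states, key=lambda s: s.year)
        return [b.value - a.value for a, b in zip(ordered, ordered[1:])]


def aggregate(name: str, events: List[Event]) -> Event:
    """Combine events into one composite event that spans all of them."""
    return Event(
        name=name,
        start_year=min(e.start_year for e in events),
        end_year=max(e.end_year for e in events),
        sub_events=list(events),
    )
```

Because a composite Event is itself an Event, aggregation can be repeated at any level, which mirrors the statement that events may be subdivided or aggregated without limit.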


[Figure: illustration of how relationships among persons, things, and places are a function of (or mediated by) events.]

How to Build an Ontology
Karl argued that, to develop a comprehensive ontology for the World-Historical Dataverse (WHD), the project should accept “upper ontology” (i.e., general categories) as it has already been developed, notably in museum studies (DOLCE and CIDOC), and then expand and modify it to fit the needs of historical analysis.

For ontology in the more specific world-historical domain, we can compare this ontological work with that in medicine, where details of the domain ontology have been established by putting numerous experts together to hammer out a common program. The costs and the required levels of individual expertise for this approach are very high, however. For historians, it will be necessary to find ways to develop the ontology with smaller teams. Crowdsourcing and a Wikipedia-like approach may be productive.

In his historical-geographic ontology, Karl began with DOLCE at the upper level, modified it somewhat, and then filled in the domain level by mining and organizing the terminology of historical atlases. He proposed equivalent steps for creating a world-historical ontology for the DVN.

World-Historical Dataverse vs. Dataverse Network
How is the World-Historical Dataverse (WHD) different from the Dataverse Network (DVN)? The answer in general seems to be that the differences lie in the services provided by the WHD version of the DVN. That is, the WHD version will set criteria for ingestion, merge data, search, download data, analyze, and visualize.

So the WHD is best seen as a portal that facilitates services and tools to serve a world-historical research agenda. Working from the WHD research agenda (as defined in the May 2011 draft proposal for a Science and Technology Center), we can propose four arenas for the research agenda, as follows (where each of these topical arenas is explored over space, time, and scale):

  • Social & natural interactions
    e.g. demography, disease, environment
  • Governance
    e.g. social uprisings
  • Development
    e.g. levels of output and lifespan
  • Social structure
    e.g. labor relations

One important step in creating this portal is the design and programming to upgrade the Dataverse Network’s architecture in order to allow for the more flexible and detailed handling of space, time, and topics required by the WHD.
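
As a rough illustration of what "criteria for ingestion" over space, time, and topic might look like, the sketch below checks a dataset description before it is accepted. The DatasetDescription fields and the three checks are hypothetical and are not part of the Dataverse Network; only the four arena names come from the list above.

```python
from dataclasses import dataclass
from typing import List

# Topical arenas taken from the research agenda listed above.
ARENAS = {
    "social & natural interactions",
    "governance",
    "development",
    "social structure",
}


@dataclass
class DatasetDescription:
    """Hypothetical minimal description of a dataset offered for ingestion."""
    title: str
    place: str        # space: free-text place name (an assumption)
    start_year: int   # time: first year covered
    end_year: int     # time: last year covered
    arena: str        # topic: one of the four arenas


def ingestion_problems(d: DatasetDescription) -> List[str]:
    """Return the reasons a description fails the checks (empty if none)."""
    problems = []
    if not d.place.strip():
        problems.append("missing place")
    if d.start_year > d.end_year:
        problems.append("start year after end year")
    if d.arena.lower() not in ARENAS:
        problems.append("unknown arena: " + d.arena)
    return problems
```

Returning a list of problems rather than a single yes or no lets a contributor see everything that needs fixing in one pass.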

Click here to listen to the October 2011 talk
(.mp3 format [to save the file, right click on the link and "save link as"])

Report on September Monthly Workshop at Pitt

Tuesday, 6 September 2011
3:00 – 5:00 p.m.
Present: D. Bain (geology), D. Holstein (history), P. Manning (history), W. van Panhuis (public health).

(1) Bain and van Panhuis on health and environment.
Dan presented the results of a month of collaboration with Wilbert. They collected and displayed data on six U.S. cities from Boston to Washington. The graphics shown here are first-draft looks at the combination of daily and annual variations in measles incidence and levels of precipitation for Boston.

[Figures: Bain and van Panhuis 1; Bain and van Panhuis 2]

This is not the place to present details: suffice it to say that the data, the methods, and the possibilities for extending this multidisciplinary analysis are quite exciting. There was discussion of possible restructuring of the investigation, in search of stronger and more interesting relationships: cities with greater variation in disease patterns and weather seasonality; analyzing low temperatures rather than precipitation. It was agreed that there will be an effort at the next stage to collect and include demographic data, especially by getting urban census samples through IPUMS at Minnesota.
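
To make the kind of comparison discussed here concrete, the sketch below aligns two dated series (for example, disease cases and precipitation) on their shared dates and computes a Pearson correlation. It is a minimal sketch under stated assumptions, not the method Dan and Wilbert used: the function names and the dictionary-keyed-by-date layout are hypothetical, and any values passed in are the user's own.

```python
from math import sqrt
from typing import Dict, List, Tuple


def align(a: Dict[str, float], b: Dict[str, float]) -> Tuple[List[float], List[float]]:
    """Keep only dates present in both series, in sorted (ISO date) order."""
    shared = sorted(set(a) & set(b))
    return [a[k] for k in shared], [b[k] for k in shared]


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

Swapping precipitation for low temperatures, as suggested in the discussion, would only change which series is passed as the second argument.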

(2) Manning on historical datasets online.
Pat listed categories of sites (by sponsors and by discipline), compared metadata required in the Dataverse Network with that in other online data sources, and illustrated the range of precision and detail in online historical datasets. The presentation emphasized the Inter-university Consortium for Political and Social Research (ICPSR) as a major collection of datasets on which we will need to draw, but which are not at all systematized. The conclusion emphasized the distinctiveness of historical data as compared with other online data. This preliminary PowerPoint is to be revised until it is ready to post on the World-Historical Dataverse website as a useful commentary on online historical datasets.

(3) Theory.
Question for the month: how does demographic analysis fit into social science more broadly?
We observed that demographic data, in census form, are recorded infrequently and don’t give much variance for analysis. Partly that is because demography changes relatively slowly, and partly we are simply missing short-term fluctuations, especially through migration.

Click here to listen to the September 2011 talk
(.mp3 format [to save the file, right click on the link and "save link as"])

Report on August 2011 Workshop

Tuesday, 2 August 2011, 3:00 – 5:00 p.m., 207 Parran
Present: D. Bain (geology), D. Burke (public health), N. Condee (global studies), D. Doebler (public health), P. Manning (history), C. Sanchez (Arts & Sciences), W. van Panhuis (public health)

"Tycho" – a public health dataset, presented by Wilbert van Panhuis (Asst. Prof. of Public Health). An early presentation of this major advance.

This is research supported by the Gates Foundation Vaccine Modeling Initiative. It collects and analyzes weekly data on cases of "notifiable diseases" reported at U.S. state, city, and county levels from 1892 to the present. Data were retrieved from multiple sources (e.g. hard copy and online but non-downloadable resources), which included manual digitization of PDF files. Wilbert gave a detailed discussion of the transformation and cleaning of data.

Visualization – three forms of presentation revealed remarkable patterns in disease incidence: 1) an imaginative circular diagram showed the changing principal diseases for a century; 2) weekly time-series showed annual fluctuations in cases, notably for measles; 3) a graphic of time vs. region (state) vs. density of disease revealed complex patterns.
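
The second and third views can be approximated from weekly case records with two simple aggregations. The sketch below is illustrative only and is not the Tycho code: the (state, year, week, cases) record layout and both function names are assumptions, and it works with raw case counts rather than a population-adjusted density.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (state, year, week of year, reported cases).
WeeklyRecord = Tuple[str, int, int, int]


def yearly_totals(records: Iterable[WeeklyRecord]) -> Dict[Tuple[str, int], int]:
    """Total cases per (state, year): the cells of a time-by-region grid."""
    totals = defaultdict(int)
    for state, year, _week, cases in records:
        totals[(state, year)] += cases
    return dict(totals)


def weekly_profile(records: Iterable[WeeklyRecord], state: str) -> Dict[int, float]:
    """Mean cases per week of year for one state, averaged across years."""
    sums = defaultdict(int)
    counts = defaultdict(int)
    for rec_state, _year, week, cases in records:
        if rec_state == state:
            sums[week] += cases
            counts[week] += 1
    return {week: sums[week] / counts[week] for week in sorted(sums)}
```

The weekly profile averages the same calendar week over many years, so a recurring seasonal peak shows up as a high value for those weeks.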

Discussion following the presentation led to a plan for immediate research and a workshop in Feb. or March with preliminary results of links among U.S. data on disease surveillance, hydrology, and population. We will inquire whether IPUMS (U. of Minnesota) is willing to collaborate on demographic data.

Click here to listen to the August 2011 talk
(.mp3 format [to save the file, right click on the link and "save link as"])

Report on July 2011 Workshop

Tuesday, 6 July 2011, 3:00 – 5:00 p.m., 3703 Posvar.
Present: D. Bain (geology), D. Berkowitz (economics), P. Manning (history), B. Ural (political science).

In this introductory session, participants described for each other their research, its objectives, and methods. Speaking to each other across disciplinary lines is good practice. We found parallels between Berkowitz's work on 150 years of elites in US states and Bain's work on soil movement and erosion in the US over 400 years. We found contrasts between identifying "instrumental variables" to establish causation in economics and political science and analyzing in terms of feedback in geology. The Bowker reading showed us that there are "data holders" in geology as well as in social sciences.

Click here to listen to the July 2011 talk
(.mp3 format [to save the file, right click on the link and "save link as"])

Dataverse Design Seminar, 2009 - 2010

Humphrey Southall, University of Portsmouth
"Visualising Britain through Time: Building and using an indefinitely-scalable library of individual statistical data values"
October 13, 2010, Information Sciences seminar room (IS 501); 3:00 – 4:30 pm

Patrick Manning, World History Center, University of Pittsburgh
"The World Historical Dataverse: Design and problem-solving for a large-scale, heterogenous, historical dataset"
April 22, 2010, IS 501, 3:30 - 5:00 p.m.

Ruth Mostern, University of California, Merced
"Mapping the Past: New Insights from Spatial History, with Examples from China and the Silk Road"
April 9, 2009, 3703 Wesley W. Posvar Hall, 4:00 – 5:30 p.m.

Patricia Seed, University of California, Irvine
"Mapping West Africa: Defining Peoples, an Ocean, and a Continent"
March 5, 2009, 3703 Wesley W. Posvar Hall, 4:00 – 5:30 p.m.