Proficiency in Spoken English Amongst Indigenous Australians in NT
This was an exercise in maximising data density and an experiment in presenting the same dataset in different forms. Both graphics below draw on one dataset, which was interesting because it is real, multi-dimensional data from the 2006 Australian Census.
In technical terms the map provided an opportunity to:
- develop a GeoJSON map with more complicated boundaries than the simple Australian map that I’d created previously. The challenge was ensuring map areas remained enclosed as the number of polygons was reduced. A smaller file often led to bleeding of areas when the area was filled.
- learn how to zoom and scroll the map, which was surprisingly easy (if you’re willing to tolerate a few glitches)
- implement mouse-overs to show another dimension of data, even though it requires the user to interact with the graph to extract that data. There doesn’t seem to be a clean way to do mouse-overs as the SVG spec doesn’t offer z-positioning, so the method feels a bit dirty.
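A note on the z-positioning problem: SVG paints elements in document order, so the last child of a group is drawn on top. One common workaround (not necessarily the method used for this map) is to move the hovered element to the end of its parent when the mouse enters it. The helper name `bringToFront` below is hypothetical, and the sketch assumes only the standard DOM `appendChild` behaviour of moving an existing child.

```javascript
// Hypothetical helper: move an element to the end of its parent's child list.
// SVG draws children in document order, so the last child ends up on top.
function bringToFront(node) {
  const parent = node.parentNode;
  if (parent) {
    // appendChild on an existing child moves it rather than copying it.
    parent.appendChild(node);
  }
  return node;
}

// Assumed usage in a browser, for a plain DOM element called `area`:
//   area.addEventListener("mouseover", (event) => bringToFront(event.currentTarget));
```

With D3 the same helper could be called from a mouse-over handler as `bringToFront(this)`, since D3 invokes handlers declared with `function` with `this` set to the element.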
This scatterplot attempts to help answer the following question:
Where is the greatest opportunity for missionary work in NT indigenous Australian communities if we wanted to minimise the effect of language barriers?
The dataset includes only the indigenous community: people who identified as Aboriginal or Torres Strait Islander and who were in the Northern Territory on census night. The dimensions are:
- percentage that consider themselves Protestant Christian
- percentage that consider they have “good” or “excellent” spoken English
- each point refers to a census Statistical Local Area (SLA). This was the smallest geographical subdivision in the 2006 census.
- the bubbles are coloured by the largest language group in the SLA. While this is another dimension on the graph, it adds little value as there is such (magnificent) diversity within those language groups. It was an example of how another dimension could be added to the graph.
- the bubble is scaled to show the indigenous population in the SLA.
The name of the SLA is shown, along with population, as a mouse-over, and a map (of dubious value) is shown when the bubble is clicked.
Apologies for the lack of labelling - I found axis labels quite hard in d3.
This graph took quite some time, and taught me about:
- text placement on a canvas
- the perils of using areas to represent relative values (humans aren’t able to compare the size of areas very well)
- linking from canvas objects
- drawing on a canvas
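On the point about areas: if bubbles are used at all, a common precaution is to make the bubble's area, not its radius, proportional to the value, which means scaling the radius by the square root. The sketch below is illustrative only. The function name `radiusForArea` and the population figures are made up, and it does not claim to be how the bubbles in this scatterplot were sized.

```javascript
// Hypothetical helper: choose a radius so that the circle's AREA is
// proportional to `value`. Area grows with radius squared, so the radius
// must grow with the square root of the value.
function radiusForArea(value, maxValue, maxRadius) {
  if (value <= 0 || maxValue <= 0) return 0;
  return maxRadius * Math.sqrt(value / maxValue);
}

// Made-up populations, not census figures.
const populations = [100, 400, 1600];
const radii = populations.map((p) => radiusForArea(p, 1600, 40));

console.log(radii); // [ 10, 20, 40 ]
```

A population four times larger gets a radius only twice as large, so its area is four times larger. D3 also ships a square-root scale (`d3.scaleSqrt` in version 4 and later) that does the same job.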
I’m really pleased with this scatterplot as a proof of concept.
The map came from the ESRI Shapefiles of the Northern Territory Statistical Local Areas that were available as a part of the Australian 2006 Census. It was converted to GeoJSON format in the QGIS program, after reducing the complexity using the QGIS simplify geometries function with a tolerance of 0.005.
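To give a feel for what a simplification tolerance does, here is a minimal sketch of Douglas-Peucker line simplification, a widely used algorithm for this kind of task. It is an assumption that it resembles what QGIS did here; QGIS may use a different algorithm or settings. The function names and the sample outline are hypothetical, and the tolerance is presumably in the layer's coordinate units.

```javascript
// Perpendicular distance from point p to the line through a and b.
// Points are [x, y] pairs.
function perpendicularDistance(p, a, b) {
  const dx = b[0] - a[0];
  const dy = b[1] - a[1];
  const length = Math.hypot(dx, dy);
  if (length === 0) return Math.hypot(p[0] - a[0], p[1] - a[1]);
  return Math.abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / length;
}

// Douglas-Peucker: keep the endpoints, find the point furthest from the
// line joining them, and recurse only if it is further than the tolerance.
function simplifyLine(points, tolerance) {
  if (points.length <= 2) return points.slice();
  const first = points[0];
  const last = points[points.length - 1];
  let maxDistance = 0;
  let maxIndex = 0;
  for (let i = 1; i < points.length - 1; i++) {
    const d = perpendicularDistance(points[i], first, last);
    if (d > maxDistance) {
      maxDistance = d;
      maxIndex = i;
    }
  }
  if (maxDistance <= tolerance) return [first, last];
  const left = simplifyLine(points.slice(0, maxIndex + 1), tolerance);
  const right = simplifyLine(points.slice(maxIndex), tolerance);
  // The split point appears in both halves, so drop one copy.
  return left.slice(0, -1).concat(right);
}

// Made-up outline: the second point is only 0.001 off a straight line.
const outline = [[0, 0], [1, 0.001], [2, 0], [3, 1], [4, 0]];
const simplified = simplifyLine(outline, 0.005);

console.log(simplified.length); // 4
```

Because each line is simplified on its own, two neighbouring areas can end up with slightly different versions of a shared border. That is one plausible source of gaps or overlaps between areas after simplification.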
The numerical data also came from the 2006 Census, using the (now superseded) CDATA online tool and focussing on spoken English proficiency, indigenous language and religion for the indigenous population in the Northern Territory. The dataset required extra processing, as the CDATA pivot tables needed aggregation to give the appropriate level of granularity. The current tool for this type of analysis of Census data is called TableBuilder. It may provide a level of control that makes this post-processing unnecessary.
My GitHub repo has the map and data files.
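The aggregation step might look something like the sketch below, which rolls pivot-table rows up to one percentage per SLA. Everything in it is an assumption for illustration: the row shape, the field names (`sla`, `proficiency`, `persons`), the category labels and the numbers are invented, and the real CDATA export will differ.

```javascript
// Hypothetical helper: for each SLA, the percentage of persons whose
// proficiency category is in `proficientLevels`.
function percentProficientBySla(rows, proficientLevels) {
  const totals = new Map();
  for (const { sla, proficiency, persons } of rows) {
    const t = totals.get(sla) || { proficient: 0, total: 0 };
    t.total += persons;
    if (proficientLevels.includes(proficiency)) t.proficient += persons;
    totals.set(sla, t);
  }
  const result = {};
  for (const [sla, t] of totals) {
    // null marks an SLA with no counted persons, avoiding a divide by zero.
    result[sla] = t.total === 0 ? null : (100 * t.proficient) / t.total;
  }
  return result;
}

// Made-up rows, one per SLA and proficiency category.
const pivotRows = [
  { sla: "SLA A", proficiency: "excellent", persons: 20 },
  { sla: "SLA A", proficiency: "good", persons: 30 },
  { sla: "SLA A", proficiency: "poor", persons: 50 },
  { sla: "SLA B", proficiency: "excellent", persons: 5 },
  { sla: "SLA B", proficiency: "good", persons: 10 },
  { sla: "SLA B", proficiency: "poor", persons: 85 },
];

const percentages = percentProficientBySla(pivotRows, ["good", "excellent"]);
console.log(percentages); // { 'SLA A': 50, 'SLA B': 15 }
```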