Pythian Blog: Technical Track

Analyzing Open.Canada.Ca datasets with Power BI

My friend Chris Jaja recently posted on LinkedIn about the awesome work his team has been doing with the Open Government Portal at the Treasury Board of Canada Secretariat. This is a great initiative, and it gave me another excuse to fire up one of my favorite data visualization tools: Microsoft Power BI. Let's check it out!

Open.Canada.Ca

The Open Government Initiative's goal is to make government more accessible for everyone. It does this by opening up data and information and by facilitating communication directly with the government. What is available:
  • Open datasets from all sorts of Canadian government organizations.
  • Government digital records, information requests and contracts.
  • An API to access all this information programmatically.
  • A communication portal to engage directly with the government.
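
To give a feel for the programmatic access mentioned in the list above, here is a minimal Python sketch. It assumes the portal exposes a CKAN-style `package_search` action under the `API_BASE` URL shown; that endpoint, the response field names and the helper names (`build_search_url`, `search`, `csv_resources`) are assumptions made for illustration and should be checked against the portal's own API documentation.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed CKAN-style endpoint; verify it against the portal's API documentation.
API_BASE = "https://open.canada.ca/data/api/action"


def build_search_url(query, rows=5):
    """Build a package_search URL for a free-text query."""
    return f"{API_BASE}/package_search?" + urlencode({"q": query, "rows": rows})


def search(query, rows=5):
    """Call the search endpoint and return the decoded JSON reply (needs network access)."""
    with urlopen(build_search_url(query, rows), timeout=30) as reply:
        return json.load(reply)


def csv_resources(response):
    """Return (dataset title, resource URL) pairs for the CSV resources in a search reply.

    Assumes the usual CKAN layout: result -> results -> resources, where each
    resource carries "format" and "url" fields.
    """
    pairs = []
    for dataset in response.get("result", {}).get("results", []):
        for resource in dataset.get("resources", []):
            if resource.get("format", "").upper() == "CSV":
                pairs.append((dataset.get("title", ""), resource.get("url", "")))
    return pairs
```

Calling `csv_resources(search("fertility rate"))` would then list the CSV downloads for the matching datasets, provided the endpoint behaves as assumed.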
For my purposes, the datasets and the digital records are the main goodies that I can use to learn, practice and share business intelligence skills with our customers. Even if you are not Canadian, this is still a great resource! There are countless datasets and digital records here that you can use for many purposes. Whether you want to build a demo solution on a SQL database, practice report creation in Power BI or Tableau, or test a machine-learning model, there is abundant, well-documented and free-to-use information here.

Why Power BI?

If you follow my blog and presentations, you know I'm a big fan of Power BI. Here, it is a great match to consume, model and visualize the datasets from the Open Canada portal. There are a few reasons for this:
  • Cost: the free account is more than enough to be useful.
  • Learning curve: Power BI is one of the easiest visualization tools to pick up.
  • Flexibility: it consumes most data formats and includes data transformation and modelling capabilities.
An example

Using the search capability of Open.Canada, I simply looked for "fertility rate". Immediately I got all the fertility rate datasets maintained by Statistics Canada. I was able to open one, choose which fields I wanted and easily download them as a CSV file. With Power BI Desktop, I cleaned up the column names so they were human-friendly and re-formatted the field values for the geography and the age categories. Then I created a simple report to browse through this fertility rate dataset graphically. Finally, to be able to share it with all of you, I published it to PowerBI.com. All of this with my free account.

In this report, we have the fertility rates for each Canadian province, categorized by age bracket, for every year from 2000 to 2016. Even with such a simple dataset, we can see some interesting information, such as the changes over those 16 years in the leading age category in which women are giving birth. This clearly marks a shift in our society toward forming families and having children at later stages of our lives. It is also interesting to see trends over the years for provinces and how the more rural, remote areas have higher fertility rates than the more populated urban ones.

The visual below is 100% interactive (and Power BI keeps it free to publish or embed on a public blog). In the bottom right corner you can enter full screen mode to better see and manipulate it.

What's next?

Based on Chris' post, his team is continuing to work on the portal to add better search capabilities, improve data quality, add AI, more APIs, etc. I'm very excited to have this treasure trove of data available now for everybody and can't wait to use more of it for my own learning, research and presentations. Hopefully you will find it just as useful. Cheers!
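
As a postscript for readers who would rather script the clean-up than click through it: the steps from the example (friendlier column names, tidied field values, then a look at the leading age category) can be roughly sketched in plain Python. This is not the Power BI workflow itself, only an illustration of the same idea. The raw header names in `FRIENDLY_NAMES`, the helper names and the rows in `SAMPLE` are all hypothetical; the numbers are made-up placeholders, not Statistics Canada figures.

```python
import csv
import io

# Hypothetical raw-to-friendly header mapping; the real download's headers may differ.
FRIENDLY_NAMES = {
    "REF_DATE": "Year",
    "GEO": "Province",
    "AGE_GROUP": "Age group",
    "VALUE": "Fertility rate",
}

# Made-up placeholder rows, only here to exercise the functions below.
SAMPLE = """REF_DATE,GEO,AGE_GROUP,VALUE
2000, Example province ,25 to 29 years,100.0
2000, Example province ,30 to 34 years,90.0
2016, Example province ,25 to 29 years,85.0
2016, Example province ,30 to 34 years,105.0
"""


def clean_rows(csv_text):
    """Rename columns, trim stray whitespace and parse the year and rate as numbers."""
    cleaned = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {FRIENDLY_NAMES.get(name, name): value.strip() for name, value in raw.items()}
        row["Year"] = int(row["Year"])
        row["Fertility rate"] = float(row["Fertility rate"])
        cleaned.append(row)
    return cleaned


def leading_age_group(rows):
    """Map each (province, year) pair to the age group with the highest rate."""
    best = {}
    for row in rows:
        key = (row["Province"], row["Year"])
        if key not in best or row["Fertility rate"] > best[key]["Fertility rate"]:
            best[key] = row
    return {key: row["Age group"] for key, row in best.items()}


rows = clean_rows(SAMPLE)
leaders = leading_age_group(rows)
```

With a real export in place of `SAMPLE`, comparing `leaders` across years would show the kind of shift in the leading age category that the report makes visible.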
