Home

Pinned

Recent

December 2023

Completed 3/8 courses of Google's Cybersecurity Professional Certificate.

April/May 2023

I am now addicted to Power BI.  For my software engineering class, I am working as part of a group to create a web app for handling online donations.  I seemed to be the only one who cared about separating the data into fact and dimension tables.  I took the liberty of generating mock data for our project, and once it was there I could not help but bring it into Power BI.

It was nice to be able to offer my group tips on how to organize the data.  Once we had the tables set up, I liked being able to show them how the tables would interact in the Power BI relationship view.  I really appreciate the simplicity of our little database in comparison to the "messy spaghetti" look that the large data models at work take on.  I certainly wish we could use a Power BI dashboard to visualize data on the site.
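As a rough illustration of what separating data into fact and dimension tables can look like, here is a minimal Python sketch.  It is not the project's actual code: the Donation record, its field names, and the split_star_schema helper are all hypothetical, assumed only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Donation:
    """One flat, denormalized donation record (hypothetical shape)."""
    donor_name: str
    donor_email: str
    campaign: str
    amount: float


def split_star_schema(rows):
    """Split flat rows into two dimension tables and one fact table.

    Donors are keyed by email and campaigns by name; both receive
    simple integer surrogate keys in order of first appearance.
    """
    donor_ids = {}     # email -> surrogate donor_id
    campaign_ids = {}  # campaign name -> surrogate campaign_id
    dim_donor, dim_campaign, fact_donation = [], [], []

    for row in rows:
        if row.donor_email not in donor_ids:
            donor_ids[row.donor_email] = len(donor_ids) + 1
            dim_donor.append({
                "donor_id": donor_ids[row.donor_email],
                "name": row.donor_name,
                "email": row.donor_email,
            })
        if row.campaign not in campaign_ids:
            campaign_ids[row.campaign] = len(campaign_ids) + 1
            dim_campaign.append({
                "campaign_id": campaign_ids[row.campaign],
                "name": row.campaign,
            })
        # The fact row keeps only keys and the numeric measure.
        fact_donation.append({
            "donor_id": donor_ids[row.donor_email],
            "campaign_id": campaign_ids[row.campaign],
            "amount": row.amount,
        })

    return dim_donor, dim_campaign, fact_donation
```

The descriptive text lives once in each dimension table, and the fact table stays narrow, which is what keeps the relationship view simple.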

March 2023

I have continued to become more comfortable with the database structure at Ortho Molecular Products.  My understanding of the measures and dimension tables is at the point where I can determine during the planning phase whether the report I am building requires an additional calculated measure.  I am comfortable writing my own measures, and have become significantly more reliable at working within a "best practices" framework.

I have spent a significant amount of time working within Power BI Paginated Report Builder.  These reports are sent out via weekly subscriptions, so all components must update automatically rather than being "hard coded".  This experience has greatly influenced how I think about reports and measures when designing them, and has led me to rely on concrete, well-tested logic that requires no scheduled maintenance.  I use Matrix visuals to allow for changes within Sales Territories, active Salespeople within those territories, and rolling changes in tracked products, without breaking the format of the report.  The timeline of these paginated reports defaults to the current week of the current year, and is an adjustable parameter that the user can change to generate historic data.
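The idea of a date range that defaults to the current week but can be overridden for historic data can be sketched in Python as follows.  This is only an illustration, not the Report Builder parameter itself: the week_range function is hypothetical, and it assumes ISO weeks (Monday through Sunday), which may not match the week definition used in the real reports.

```python
from datetime import date, timedelta


def week_range(today=None, year=None, week=None):
    """Return (monday, sunday) for an ISO year and week.

    With no arguments the range covers the week containing today,
    so nothing is hard coded.  Passing year and/or week overrides
    the default to look at historic data.
    """
    today = today or date.today()
    iso_year, iso_week, _ = today.isocalendar()
    year = iso_year if year is None else year
    week = iso_week if week is None else week
    monday = date.fromisocalendar(year, week, 1)
    return monday, monday + timedelta(days=6)


# Default: the week containing the given (or current) date.
start, end = week_range(date(2023, 3, 15))
# Override: the first ISO week of 2023.
first_start, first_end = week_range(year=2023, week=1)
```

Because the default is computed at run time, a report scheduled weekly keeps pointing at the right week without any maintenance.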

January 2023

This year I've been given the privilege of working as a Data Engineer on Ortho Molecular Products' Business Intelligence Team.  Although I have worked fairly extensively with databases and SQL through my schooling, this is the first time I have gotten to work with "real world" data.

Because I took "Server Side Scripting" and "Web Server and Unix Administration" in fall 2022, the fundamental building blocks of databases and SQL were fresh in my mind.  The tables used in those classes consisted of 3 or 4 columns and maybe a dozen rows.  In the real world, the dimension tables I work with have around a dozen columns and are millions of rows long.

This stark difference has resulted in a rapid evolution of my approach to working with data.  I now have to take into account other factors such as data refresh rate, and I have learned how to use scripts or macros to filter out or correct data formatting errors.  In the real world, it is extremely inefficient or sometimes impossible to manually find discrepancies in dimension tables with millions of entries.  Additionally, real-time refreshing of user-accessed data sets is generally impractical.  This is especially important to consider when the data is stored at a remote location, as is the case with my employer.
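A script that corrects or filters out formatting errors might look something like the Python sketch below.  It is an assumption-laden example rather than anything used at work: the order_date column name, the list of accepted date formats, and both helper functions are hypothetical.

```python
from datetime import datetime

# Assumed input formats; a real table would need its own list.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def normalize_date(raw):
    """Return the date as an ISO string, or None if no format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_rows(rows):
    """Split rows into (clean, rejected) based on the order_date column.

    Rows whose date can be parsed are corrected to ISO format;
    the rest are set aside for review instead of being silently dropped.
    """
    clean, rejected = [], []
    for row in rows:
        iso = normalize_date(row["order_date"])
        if iso is None:
            rejected.append(row)
        else:
            clean.append({**row, "order_date": iso})
    return clean, rejected
```

Keeping the rejected rows makes it possible to see how many entries failed and why, which is not something that can be checked by eye across millions of rows.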

The report in this screenshot has been modified to use generated mock data.  The real report tracks many more reports, which is the reason for such a large table on the left side.  There is also significantly more fluctuation within the real data, which allows for better visualization of trends and identification of underutilized or unutilized reports.
In the real world, yearly reports would likely not be viewed as frequently as their weekly counterparts.  In the future, I will likely modify my data generating code to take factors like this into account.
Building reports like this makes up a small but important fraction of what I do, and it is the easiest responsibility to visualize on this website.
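One way the mock data generator could account for yearly reports being viewed less often than weekly ones is to weight the random choice by report cadence.  The Python sketch below is not the generator used for the screenshot; the cadence labels, the weights, and the generate_views function are hypothetical values chosen for illustration.

```python
import random
from collections import Counter

# Hypothetical relative view weights per report cadence.
CADENCE_WEIGHTS = {"weekly": 10, "monthly": 4, "yearly": 1}


def generate_views(reports, n_views, seed=0):
    """Generate mock view counts per report.

    reports maps a report name to its cadence.  Each mock view picks
    a report at random, weighted by cadence, so weekly reports tend
    to collect more views than yearly ones.  A fixed seed keeps the
    mock data reproducible between runs.
    """
    rng = random.Random(seed)
    names = list(reports)
    weights = [CADENCE_WEIGHTS[reports[name]] for name in names]
    return Counter(rng.choices(names, weights=weights, k=n_views))


views = generate_views(
    {"Weekly Sales": "weekly", "Annual Summary": "yearly"},
    n_views=1000,
)
```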

Something entirely new to me this year was visualizing large amounts of data.  While working with the small datasets in my classes, data visualization consisted entirely of outputting the data to a table on a webpage or user terminal.  This is simply not possible when viewing large amounts of data.  In my position, I have learned how to use tools such as Microsoft Power BI to visualize data in a meaningful way.

I created the Power BI report above as a way to track and visualize usage of other reports accessible to employees across the company.  The goal of this project is to give those on the Business Intelligence team a way to identify and reduce unused or sparsely accessed reports.  Simply measuring the usage of each report is insufficient, as it is possible for the only viewer to be the developer who maintains the report.
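The point about raw usage being misleading can be illustrated with a small Python sketch that counts distinct viewers other than the report's developer.  The real report is built in Power BI, so this is only an analogy: the view log shape, the developers mapping, and the external_viewers function are hypothetical.

```python
from collections import defaultdict


def external_viewers(view_log, developers):
    """Map each report to the set of viewers who are not its developer.

    view_log is an iterable of (report, user) pairs and developers
    maps a report name to the user who maintains it.  A report that
    appears in the log but has an empty set was only ever opened by
    its own developer.
    """
    viewers = defaultdict(set)
    for report, user in view_log:
        audience = viewers[report]  # creates the entry even if empty
        if user != developers.get(report):
            audience.add(user)
    return dict(viewers)


log = [
    ("Sales", "dev1"),
    ("Sales", "dev1"),
    ("Inventory", "dev2"),
    ("Inventory", "amy"),
    ("Inventory", "amy"),
]
result = external_viewers(log, {"Sales": "dev1", "Inventory": "dev2"})
# Reports with no outside audience are candidates for retirement.
unused = [report for report, users in result.items() if not users]
```

Here the Sales report has two recorded views but no outside audience, which a plain view count would hide.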

Not only does the above report visualize report usage in multiple ways, but it also gives the user a way to easily filter the report to display the information most relevant to them.  As seen in the screenshot, selecting a row in the table adjusts the visuals to display usage details only for that report.  This dual use allows broad usage trends across all data to be visualized on the same page as the filtered, report-specific views.

As this was one of my first reports, an objective of the project was to practice visually formatting for aesthetic reasons.  Just about every possible way to modify the report visually (colors, text, rounded visual corners)