What Data Analysts Can Learn from Software Engineers

Data analysts and software engineers solve problems differently, but borrowing engineering best practices can make you a sharper, more efficient analyst. Here’s how.

Data analysts and software engineers might seem like different breeds: one focuses on insights, the other on building systems. But in today’s data-driven era, the best analysts are stealing pages from the engineering playbook.

Why? Because modern data work isn’t just about SQL and dashboards. It’s about scalability, reproducibility, and collaboration – areas where engineers have decades of wisdom to share.  

Adopt Version Control (Git for Data Pros)

The Problem: “Final_Final_Report_v3.xlsx”

Many analysts rely on ad-hoc file naming or (worse) overwrite files in place. Engineers use Git, and you should too.

How to Start:  

  • Track SQL scripts and notebooks in GitHub/GitLab (even for solo projects).
  • Use descriptive commit messages (e.g., “Fixed revenue calculation logic”).
  • Try tools like dbt (built for version-controlled data transformations).

Example: [How Airbnb’s data team uses Git](https://medium.com/airbnb-engineering/using-git-for-data-science-at-airbnb-f082afff57f8).  

Write SQL Like Code (Clean, Modular, Documented)

Bad Habits to Break:

  • Monolithic queries (500-line SQL with no CTEs).  
  • Hard-coded values (use variables or config files).  
  • Zero comments (“Why did I filter out these users?”).  

Engineer-Approved Fixes:

  • Use CTEs (WITH clauses) for readability.  
  • Parameterize queries (e.g., with Jinja in dbt).
  • Add a `-- Purpose` header to every script.
```sql
-- Purpose: Calculate monthly retention (cohort analysis)  
-- Last updated: 2024-06-15 by [Your Name]  
WITH user_cohorts AS (  
  SELECT  
    user_id,  
    DATE_TRUNC('month', signup_date) AS cohort_month  
  FROM users  
)  
-- Rest of query...  
```
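As a rough sketch of the fixes above, the snippet below keeps the start date out of the SQL text and passes it in as a named parameter, with the logic split into a CTE. It uses Python's built-in `sqlite3` module only so the example is self-contained. The demo `users` rows, the `CONFIG` dict, and the `cohort_sizes` and `demo_connection` helpers are hypothetical, and `strftime` stands in for `DATE_TRUNC`, which SQLite does not provide. In a dbt project, Jinja variables would play the role of the parameter.

```python
import sqlite3

# Hypothetical config; in practice this might be loaded from a config file.
CONFIG = {"min_signup_date": "2024-01-01"}

COHORT_SIZES_SQL = """
-- Purpose: Count signups per monthly cohort on or after a start date
WITH user_cohorts AS (
  SELECT
    user_id,
    strftime('%Y-%m-01', signup_date) AS cohort_month
  FROM users
  WHERE signup_date >= :min_signup_date
)
SELECT cohort_month, COUNT(*) AS users
FROM user_cohorts
GROUP BY cohort_month
ORDER BY cohort_month
"""


def cohort_sizes(conn, min_signup_date):
    """Return (cohort_month, users) rows for signups on or after min_signup_date."""
    params = {"min_signup_date": min_signup_date}
    return conn.execute(COHORT_SIZES_SQL, params).fetchall()


def demo_connection():
    """Build an in-memory database with a few made-up users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER, signup_date TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [(1, "2023-12-30"), (2, "2024-01-05"), (3, "2024-01-20"), (4, "2024-02-02")],
    )
    return conn


if __name__ == "__main__":
    print(cohort_sizes(demo_connection(), CONFIG["min_signup_date"]))
```

Changing the reporting window now means editing `CONFIG`, not hunting for a date literal buried in the query.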

Test Your Data Like Code

Engineers write unit tests. Analysts should too.

Common Data Issues to Catch Early:

  • Null values where they shouldn’t be.  
  • Duplicate records skewing metrics.  
  • Boundary cases (e.g., negative revenue).  

Tools to Borrow:

  • dbt tests (validate uniqueness, referential integrity).
  • Great Expectations (Python-based data testing).
  • Simple asserts (e.g., a check query with `WHERE revenue < 0` that should return zero rows).
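The "simple asserts" idea can be sketched without any extra tooling: write each check as a query that counts violations, and treat any non-zero count as a failure. The example below is a minimal illustration using Python's built-in `sqlite3`. The `orders` table, its columns, and the `run_checks` and `failed_checks` helpers are hypothetical names chosen for the demo, not part of dbt or Great Expectations.

```python
import sqlite3

# Each check counts rows that violate an expectation, so 0 means "pass".
# Table and column names are hypothetical.
CHECKS = {
    "null_user_id": "SELECT COUNT(*) FROM orders WHERE user_id IS NULL",
    "duplicate_order_id": """
        SELECT COUNT(*) FROM (
          SELECT order_id FROM orders GROUP BY order_id HAVING COUNT(*) > 1
        )
    """,
    "negative_revenue": "SELECT COUNT(*) FROM orders WHERE revenue < 0",
}


def run_checks(conn):
    """Return {check_name: number_of_violations} for every check in CHECKS."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in CHECKS.items()}


def failed_checks(conn):
    """Return the names of checks that found at least one violation."""
    return [name for name, count in run_checks(conn).items() if count > 0]


def demo_connection():
    """Build an in-memory table with one null, one duplicate, one negative value."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER, user_id INTEGER, revenue REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10, 25.0), (2, None, 40.0), (2, 11, 40.0), (3, 12, -5.0)],
    )
    return conn


if __name__ == "__main__":
    failures = failed_checks(demo_connection())
    if failures:
        print("Data checks failed:", ", ".join(failures))
```

Running checks like these before a report refresh surfaces bad data early, instead of after a stakeholder spots an odd number on a dashboard.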

Borrow These Engineering Mindset Shifts

Automate the Boring Stuff

  • Scheduled refreshes > manual report runs.
  • Self-service dashboards > ad-hoc requests.

Design for Debugging

  • Log your steps (e.g., “Applied filter: active_users_only”).  
  • Assume someone else will inherit your mess.

Optimize for Maintenance, Not Just Speed

A 10% slower query that’s readable > a “clever” one-liner nobody understands.
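The last two habits, logging each step and preferring readable code over clever code, can be sketched together. The snippet below is one possible shape, assuming rows are plain Python dicts. The `apply_filter` and `active_paying_users` helpers and the field names are hypothetical; only the standard `logging` module is used.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("report")


def apply_filter(rows, name, keep):
    """Keep rows where keep(row) is true and log the row counts before and after."""
    kept = [row for row in rows if keep(row)]
    logger.info("Applied filter: %s (%d -> %d rows)", name, len(rows), len(kept))
    return kept


def active_paying_users(rows):
    """Apply named filters one step at a time instead of one dense expression."""
    rows = apply_filter(rows, "active_users_only", lambda r: r["active"])
    rows = apply_filter(rows, "positive_revenue", lambda r: r["revenue"] > 0)
    return rows


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    users = [
        {"user_id": 1, "active": True, "revenue": 30.0},
        {"user_id": 2, "active": False, "revenue": 12.0},
        {"user_id": 3, "active": True, "revenue": 0.0},
    ]
    print(active_paying_users(users))
```

Each step costs an extra pass over the data, but the log shows exactly where rows were dropped, which is usually what the next person debugging the report needs.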

Conclusion 

Data analysis is evolving—and the line between “analyst” and “data engineer” is blurring. By adopting software engineering best practices, you’ll:  

  • Spend less time firefighting (debugging, recreating work).  
  • Become a better collaborator (clean code = happy teammates).  
  • Future-proof your skills as AI automates routine tasks.