• Why I’m struggling with learning Azure Synapse

    So, for 2023 I’ve decided that I want to learn Azure Synapse. I want to be able to make training content on it by the end of the year. I’d like to be able to consult on it in two years. And right now, I am absolutely banging my head against the learning curve. Let’s talk about why.

    The integration problem

    Occasionally, I’ll describe Power BI as “3 raccoons in a trench coat: Power Query, DAX, and visuals”. What I mean by that is it is 3 separate products masquerading as a single, perfectly cohesive product. Each of those pieces started out as a separate Excel add-in, and they were later combined into a single product. And it shows.

    The team at Microsoft have done a great job of smoothing out the rough edges, but you still occasionally run into situations where the integration isn’t perfect. A simple example: where should I create my date tables in Power BI? Should I use M or DAX? The answer is either! Both of them have good tooling for it. Because these tools evolved separately, there’s going to be some overlap and there are going to be some gaps.

    Azure in general (and Synapse in particular) has this problem. If Power BI is 3 raccoons in a trench coat, Synapse is 10 of them wobbling from side to side. The power of the cloud is that Microsoft can quickly iterate and provide targeted tooling for specific needs. If a tool is unpopular or unsuccessful, like Azure Data Catalog, Microsoft can build a replacement, like Azure Purview.

    But this makes learning difficult. Gone are the days of a monolithic SQL Server product where, in theory, all of the parts (SSRS/SSIS/SSAS) are designed to fit cohesively into a single product. Instead, Microsoft and we data professionals must provide the glue after the fact, after these products have evolved and taken shape. Unfortunately, this means understanding not only how these pieces fit together but also when, in practice, they don’t.

    This is the curse of the modern cloud professional. We are all generalists now.

    The alternatives problem

    The other big problem is that, just like with M and DAX, there are multiple tools available to do the same job. And while M and DAX compete on the borders or at the joints, Azure Synapse has tools that are direct competitors. The most prominent example is the querying engines.

    From what I understand, Azure Synapse has 3 main ways to access and process data: dedicated SQL pools, Apache Spark pools, and serverless SQL pools. Imagine if I told you that you had 3 ways to cut things: a scalpel, a butter knife, and a wood saw. These all cut things, it’s true. But then imagine if I immediately dived into what type of metal we use for our butter knives, that our saws have 60 teeth on them, etc.

    It would be a little disorienting. It would be a little frustrating.

    You might wonder how we ended up with 3 different tools that do similar things. You might wonder when you should use which. You might especially wonder when you shouldn’t use one of them. Giving your learners the general shape and parameters of a tool is a big deal.

    Imagine if a course on Azure ButterKnife™ instead started with “This is Azure ButterKnife™. It is ideal for cutting food, especially soft food. It shouldn’t be used on anything harder than a crispy piece of toast. It originally started as a way to spread butter on toast.” It would take 20 seconds to orient the learner, and if they were looking for a way to cut lumber, they could quickly move on.

    The expertise problem

    When I was doing a course on ksqlDB for Kafka, I ran into a particular problem. Because ksqlDB was a thin layer of SQL on top of a well-known Kafka infrastructure, so much of the content assumed you were experienced and entrenched in the Kafka ecosystem. It quickly covered terms and ideas that made sense in that world, but no sense if you were coming from the relational database world.

    And a thing I would keep asking, to no one in particular, was “How did we end up here?” What was the pain point that caused people to create an event stream technology and then put a SQL querying language on top, instead of just using a relational database? I talk about this more on a podcast episode with the company that made ksqlDB.

    Azure Synapse has a similar problem. It is an iteration on various technologies over the past decade. And it’s designed to support large datasets (multi-terabyte) and complex enterprise scenarios. And so a lot of the content out there assumes a certain level of expertise, in part because the people interested in it and the people training on it are both experts.

    The challenge this presents is twofold. First, the more of an expert you are, the harder it is to empathize with a new learner. Often the best teacher is someone who learned a technology a year ago, and remembers all the stumbling blocks. This is a challenge I struggle with regularly myself.

    The other issue is that the content often presupposes the learner knows what the foundational technologies are and why they are important. It might assume the learner knows what Delta Lake is, and what Parquet is, and um, why are we storing all our data in flat files to begin with???

    That’s not to say that every course needs to be a 9-hour foundations course. But there are ways to briefly remind the viewer why something is important, what pain point it solves, and why they should care. And if they are totally new, this helps orient them quickly.

    For example, a course could say “Here we are using the Delta Lake approach. This allows us to enhance the efficient columnar storage of Parquet files with the ACID compliance that we usually lose out on when using a data lake.” This explains to new learners why we are here and reminds seasoned learners why they should care. This can be done quickly and deftly, without feeling like you are talking down to experienced learners.
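
    If a learner wants to see what “ACID on top of flat files” even means, a toy sketch can help. The Python below is not Delta Lake and does not use its API. It is a simplified illustration of the general idea as I understand it: data files are immutable, and a numbered commit log decides which files count as part of the table. The `_log` folder, the JSON data files (standing in for Parquet), and every function name here are made up for this example.

```python
import json
import os
import tempfile


def write_data_file(table_dir, name, rows):
    """Write an immutable data file (JSON here, standing in for Parquet)."""
    with open(os.path.join(table_dir, name), "w") as f:
        json.dump(rows, f)


def commit(table_dir, version, actions):
    """Publish a change by creating the next numbered log entry.

    Mode "x" raises FileExistsError if that version already exists,
    so two writers cannot both claim the same commit number.
    """
    log_dir = os.path.join(table_dir, "_log")  # hypothetical folder name
    os.makedirs(log_dir, exist_ok=True)
    with open(os.path.join(log_dir, f"{version:020d}.json"), "x") as f:
        json.dump(actions, f)


def read_table(table_dir):
    """Replay the log in order; only files it references are visible."""
    log_dir = os.path.join(table_dir, "_log")
    entries = sorted(os.listdir(log_dir)) if os.path.isdir(log_dir) else []
    live = []
    for entry in entries:
        with open(os.path.join(log_dir, entry)) as f:
            for action in json.load(f):
                if action["op"] == "add":
                    live.append(action["file"])
                elif action["op"] == "remove":
                    live.remove(action["file"])
    rows = []
    for name in live:
        with open(os.path.join(table_dir, name)) as f:
            rows.extend(json.load(f))
    return rows


def demo():
    """Return what a reader sees before and after the second commit."""
    with tempfile.TemporaryDirectory() as table_dir:
        write_data_file(table_dir, "part-0.json", [{"id": 1}])
        commit(table_dir, 0, [{"op": "add", "file": "part-0.json"}])
        # Written but not yet committed: readers should not see this file.
        write_data_file(table_dir, "part-1.json", [{"id": 2}])
        before = read_table(table_dir)
        commit(table_dir, 1, [{"op": "add", "file": "part-1.json"}])
        after = read_table(table_dir)
        return before, after
```

    Calling `demo()` shows the point: the second data file is sitting on disk the whole time, but a reader only picks it up once the commit lands in the log. A plain folder of files gives you no such guarantee, which is the pain point the one-sentence explanation above is hinting at.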

    So now what?

    I’m hoping this will help folks who make content in this area. If nothing else, I hope it will be a reminder to me a year from now, when I’ve forgotten what a pain this was. In the next blog post, I’ll write about the instructional design techniques people can use to get around these issues.

  • Lessons learned from being self-employed: 4 years in

    2022 was my best year financially and probably my worst year personally. This was the year that we achieved financial independence. We had 6-12 months of expenses in the bank and the royalties were covering our living expenses. It was also the year that I found that a relaxing weekend off wasn’t enough anymore, that bouncing back wasn’t working.

    Too much, all at once

    2021 was a very quiet year as far as my consulting was concerned. About 10% of my revenue was consulting and the other 90% was royalties and completion payments. In 2022 that changed, however.

    There were about 3 months when I was billing 20-30 hours per week, on top of signing up for a new course near the end of it. Because the royalties were covering my expenses, this all went right into savings and we ended up with 6-12 months of savings. This is a consultant’s dream.

    Unfortunately, life was occurring at the same time.

    My husband had an elective surgery that we had been planning for months. It went great and he’s completely recovered. What this meant, though, is that I was tasked as nurse for 2 weeks and janitor for 3 months. I was suddenly doing all the chores that I had taken for granted, while also working 40-50 hours per week.

    Near the end of this, my mom started having issues as well. The isolation of Covid was finally taking its toll. She was clearly lonely and bored and only really got out of her apartment every other week.

    This also has been largely resolved, but for a while I was bribing myself with Magic: The Gathering boosters to call her every day and check up on her. We’ve increased the services that she’s receiving, and she gets out twice a week now, but during the summer it was a really challenging time.

    When your body stops working

    I think many would describe what I went through as burnout. I’m not sure of the right term, but stuff just stopped working. More coffee didn’t help. I would schedule a weekend to catch up on a course and get nothing done. I would take a few extra days off, to no lasting effect.

    Something broke.

    Realizing I needed something more, I scheduled 2 weeks off at the end of the year. As a consultant it’s difficult to take time off unless you plan it far in advance. It’s even more difficult if you feel like you are always behind on projects. I only made 2 courses this year and the second one was 3 months late, horrifically overdue.

    I’m one week in and I think this was 100% the right choice. I needed a deeper rest to catch up from the last 3 years.

    A gut punch from Pluralsight

    A couple of years ago, Pluralsight was purchased by private equity. I was cautiously optimistic at the time that this might enable them to get away from the quarterly cycle of the stock market. The results were mixed, with them making a very large acquisition of A Cloud Guru, which is still resolving.

    But in December this year, the company had 20% layoffs, essentially firing 400 employees. There were also changes for authors, and while I can’t get into the details, I’m expecting my royalties to go down 25%. This will put me below sustainability, with royalties no longer covering 100% of my living expenses.

    So now what?

    For now I’ve been focusing on enjoying my vacation, recovering from 2022, and not worrying about the short term. I’ve also been reaching out to colleagues and peers, asking for advice.

    I no longer see Pluralsight as a sustainable career, which means looking into doing more consulting or selling my content elsewhere. I could also get a regular W-2 job, but I would lose much of the flexibility that helps me take care of my mom.

    In the end, I think I’ll be fine. But I have no idea what I’ll be doing for a living by the end of 2023.