If you had told Dash Wieland ten years ago that he was going to become a data scientist, he would have said, “A what?” It was only a few years ago that data science crystallized in the public understanding as a field in its own right, as phrases like “Big Data” and “machine learning” started popping up in tech journalism and the broader lexicon like dandelions invading a suburban lawn.
Dash started his career in marketing after earning a BA in psychology. Soon, hungry for more interesting projects, he began an MS in predictive analytics at Northwestern University, which he received after three years of remote study. Since then, he’s left marketing for the academic publishing industry and earned a second master’s degree, this time in spatial analysis for public health, from Johns Hopkins. Luckily for us, he didn’t join his classmates in studying coronaviruses, which meant he had a little free time to talk about his life as an analyst.
It’s interesting to me that you got a degree in psychology but went straight into marketing. How did that happen? Did you already have some hard skills that you brought with you into the job?
I wasn’t interested in counseling. What I studied was more psychometrics, quantitative psychology. I was part of the team that built a test that was basically looking at holistic student outcomes between students who had studied abroad and students who didn’t study abroad. And that included things like GPA, but also empathy and emotional resilience. I came out with fairly strong knowledge of how to build survey tools, how to build a measure, how to collect a sample, how to analyze that data. So marketing was an easy thing to get a job in, because a lot of small marketing businesses, in the US at least, are being asked for the first time ever to prove that what they’re doing works. And none of them are data people, you know, they’re all creatives. They’re all artists or writers. So I was able to get a job at a small, local marketing company, basically doing surveys for them. That was essentially the calculus.
Did that prepare you for more hardcore analysis?
Yeah, I think so. I think the good thing about working with survey data for analysts is that it emphasizes some of the — I don’t know how to say it — some of the more ephemeral aspects of data analysis. It’s easy to take a CSV file and then run a bunch of correlations or run some tests on whether there are differences in the means. It’s easy to apply all these different analytics tools because of how simple a lot of the tools that are available nowadays are. But if you’re not putting thought into what you’re doing, if you’re not careful about what you’re choosing, you can make mistakes without being aware of it. But with surveys, you’ll quickly realize that if you ask some question in a slightly different way, you’ll get completely different responses. And so it emphasizes the fragility of these analyses. And then, if you’re not really careful with how you’re setting up your questions, how you’re tracking the data, how you’re testing the data, you can start to get erroneous or conflicting results. So I would say that was the primary reason that it was helpful.
From a very practical standpoint, I also found shifting from a classroom setting to a real-world setting helpful. It’s good just to get practice at very basic data cleaning. So you’ve got some messy, short-answer data, what can you work with? Can you work with strings to clean up that data? Or can you do very basic text analysis? Things like that, just actually getting your hands on real-world data, which is inevitably much, much messier than anything you’ve seen in a classroom. That was good as well.
You’re currently working at a preprint platform, Research Square. What do you do there?
Some of it’s just very basic business stuff. How we’re doing financially, making a projection for revenue for next quarter. Take a look at how many people took vacations last year and try to project how many people will take vacations next month, just very typical generic business work. But some of it is focused more on our interest in whether preprints work in the scientific field. [Preprints are manuscripts of academic papers that have yet to be peer-reviewed.] For instance, I took part in an analysis that looked at the length of scientific publications pre- and post-COVID. We took this big dataset and counted up all the words in each manuscript, and basically found that since the pandemic, scientific publications, especially those that are focused on the disease, have gotten much, much shorter. And so that’s work that we’ll publish and we’ll be able to share and talk about the ramifications that has for the industry. I’d say my job is about a 50/50 blend of business work and research into how the way that scientists share their work is changing.
What do you like about this kind of work in general?
I really like the puzzle solving. I’ve been playing video games since I was a little kid, and there was always a spark of joy in my head when I would solve the puzzle — you know, get through the maze, reach the objective — and with analysis work that happens all the time. You do a little query, and you get those numbers and the numbers line up and you can get the spark of joy. It’s exciting to solve a problem.
I really like the freedom that I have. It’s very self-directed. That varies, not all data analyst jobs are that way, but in the ones I’ve had, I’ve been the only analyst or one of a very small number of analysts. There isn’t necessarily somebody who’s directing every action that you take. It’s more like, “Here are our problems. Can you help us solve these?” I really like that setup. Having a laundry list of things to do is less interesting than having a problem to solve.
I really like the remote work [Dash lives in Indiana but works for a company in North Carolina], although again, that’s not ubiquitous in data analysis, but it’s more common than for other professions.
One other thing that’s kind of nice about the field is that it’s relatively easy to get hired as a freelancer. That’s because people often don’t have a clear understanding of what an analyst would do at their organization, but they might have a specific question that they’re interested in studying. It’s easy to get plugged into projects like that. And once you develop a relationship, you can start to develop more projects.
What about dislikes?
Basically every analyst, at least in my experience, is going to have at least some portion of generic business work. And that’s not particularly interesting data to most people, you know, looking at vacation times for your employees or helping the HR department. I mean, it’s fine, and in some ways it almost doesn’t matter what the data is, it’s still a puzzle. But that’s less interesting work than some of the more speculative or imaginative work, some of the higher-level or more technical work.
I would say personally I’ve been able to avoid this, but I’ve seen other people in my position struggle with public speaking. Especially if you’re in a small department, if you’re not, you know, on a data analysis team of 15 to 30 people, if it’s just you and a couple other people, you’re going to have to present. And if you can’t present well, then your ideas will be ignored. If somebody came to me and said, “I want to be a data analyst, but I only want to do the numbers, I never want to present,” I would tell them to find a big company where they can work under a manager who’s also an analyst and will protect them from having to present, will present their work for them.
I’ve also noticed that older analysts that I’ve worked with are frustrated by having to learn how to code. In both of my grad programs, I saw slightly older people, people in their 30s or 40s or 50s, really struggle with programming. There’s definitely some stress around learning languages like Python or R.
Given your expertise, do you find yourself gritting your teeth when you watch the news?
(laughs) It’s really bad. I don’t know if you’re a fan, I’m sorry if you are, but one of the worst for me is Elon Musk. Any time he talks about AI, smoke comes out of my ears. It’s very obvious when a person doesn’t know what they’re talking about.
I think my favorite is when people say AI — not predictive AI, but, like, Skynet AI — is right around the corner. It’s coming tomorrow. Look, I’ve worked on some of these things. I don’t think it’s coming.
You’ll absolutely be able to predict, say, the risk of a customer defaulting on a loan by using a machine learning model. But let’s pretend that each customer has a numeric ID and that those IDs are (for whatever reason) correlated with defaulting on a loan. The machine learning model will happily utilize that correlation to predict defaults, despite the fact that customer IDs don’t actually carry useful information related to defaulting on loans. The model can’t reason about the variables it’s given, or think creatively about what other variables should be included.
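[To make the customer ID example concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the data is synthetic, debt_ratio is a hypothetical stand-in for a variable that really does relate to default, and fit_stump and score are toy helpers, not Dash's code or any library's API. The "model" is a one-split decision stump that simply picks whichever variable best separates defaulters in the training data.]

```python
# Toy illustration only: synthetic data and hypothetical variable names.

def make_training_rows():
    """Training sample where low customer IDs happen to coincide with defaults."""
    rows = []
    for customer_id in range(1, 101):
        defaulted = customer_id <= 50          # the accidental ID pattern
        noisy = customer_id % 5 == 0           # every fifth record contradicts debt_ratio
        high_debt = defaulted != noisy
        rows.append({
            "customer_id": customer_id,
            "debt_ratio": 0.7 if high_debt else 0.3,
            "defaulted": defaulted,
        })
    return rows


def make_new_rows():
    """New customers: defaults follow debt_ratio and have nothing to do with ID."""
    rows = []
    for customer_id in range(101, 201):
        defaulted = customer_id % 2 == 0
        rows.append({
            "customer_id": customer_id,
            "debt_ratio": 0.7 if defaulted else 0.3,
            "defaulted": defaulted,
        })
    return rows


def fit_stump(rows, features, target):
    """Return (accuracy, feature, threshold, low_side_defaults) for the best single split."""
    best = None
    for feature in features:
        for threshold in sorted({row[feature] for row in rows}):
            for low_side_defaults in (True, False):
                hits = sum(
                    ((row[feature] <= threshold) == low_side_defaults) == row[target]
                    for row in rows
                )
                accuracy = hits / len(rows)
                if best is None or accuracy > best[0]:
                    best = (accuracy, feature, threshold, low_side_defaults)
    return best


def score(stump, rows, target):
    """Accuracy of a fitted stump on a set of rows."""
    _, feature, threshold, low_side_defaults = stump
    hits = sum(
        ((row[feature] <= threshold) == low_side_defaults) == row[target]
        for row in rows
    )
    return hits / len(rows)


train, new = make_training_rows(), make_new_rows()

# Given both variables, the stump has no way to know that an ID is meaningless.
stump = fit_stump(train, ["customer_id", "debt_ratio"], "defaulted")
new_accuracy = score(stump, new, "defaulted")

# A human who drops the ID column gets a less impressive fit that holds up better.
debt_only = fit_stump(train, ["debt_ratio"], "defaulted")
debt_only_new_accuracy = score(debt_only, new, "defaulted")

print("chosen variable:", stump[1], "| training accuracy:", stump[0])
print("accuracy on new customers:", new_accuracy)
print("debt_ratio-only accuracy on new customers:", debt_only_new_accuracy)
```

[In this toy setup, the stump chooses customer_id because it separates the training sample perfectly, then gets only half of the new customers right. Leaving the ID out looks worse on the training data but carries over to the new customers. Deciding to leave it out is the kind of reasoning about variables that the model cannot do for itself.]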
When you start to think of the reasoning humans can do, even with seemingly unanswerable questions like “What is art?”, it becomes very clear that we don’t have a framework for creating a truly intelligent artificial mind. Are there models that make art? Sure. Do those models have a conception of what art is as a philosophical concept? No. In my humble opinion, the gap between human and artificial intelligence is going to be difficult to close.
So yeah, it’s frustrating, but it’s more funny than anything else to hear people’s understanding of AI. I’ll say, “Hell yeah! I work in data science, I do some machine learning,” and they’re like, “So you’re building Skynet?” Well, I spent six weeks getting a neural network to be able to recognize handwritten digits, and even then it wasn’t very good. I’m not claiming to be the best, I’m just saying that it’s a lot further away than people think.