If you’re into data engineering but have had your head in the sand, been on a vacation for months without internet, or have simply been too busy to notice, then chances are you haven’t yet heard of ChatGPT.
But all that is unlikely, and I’m sure you have heard about ChatGPT. So in this post I want to put forward my initial thoughts on this new technology, which has the potential to change so many things. Data engineering is no exception.
What is ChatGPT?
ChatGPT is the new kid on the block in the Artificial Intelligence space. It is a chatbot from OpenAI: you ask it a question in plain language and it returns an answer. It has been trained on a very large amount of text.
You can have long conversations with it: start with an initial question, and it will typically have no issue answering related follow-up questions. Therein lies its power. It is like talking to a human, and if you point out that it has made an error, it will quickly apologise and rectify its mistake.
How Can ChatGPT Help You as a Data Engineer?
Ask ChatGPT something like “Write me a data access class in Python using MySQL” and it will spit out an answer in seconds, including a nicely formatted code snippet which you can use in a pipeline either as a starting point for your class or as your data access class itself – wow! 🙂
Just look at this: [screenshot]
And this: [screenshot]
You can quickly realise the potential here – it will save lots of time. And that is the whole point. It is AI that can do in seconds tasks which would otherwise take a human a lot longer.
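To make the prompt above concrete, here is a minimal sketch of the kind of data access class such a request might produce. It is my own illustration, not ChatGPT’s actual output. The class name `DataAccess` and its methods are hypothetical, and I have swapped MySQL for Python’s built-in `sqlite3` module so the example runs without a database server. MySQL drivers for Python follow the same DB-API shape, although they typically use `%s` rather than `?` as the parameter placeholder.

```python
import sqlite3
from typing import Any, Dict, List, Optional, Sequence


class DataAccess:
    """Thin wrapper around a database connection (hypothetical example class)."""

    def __init__(self, database: str = ":memory:") -> None:
        # Assumption: sqlite3 stands in for MySQL so no server is needed.
        self._conn = sqlite3.connect(database)
        # Return rows that can be converted to dicts keyed by column name.
        self._conn.row_factory = sqlite3.Row

    def execute(self, sql: str, params: Sequence[Any] = ()) -> int:
        """Run a write or DDL statement and return the cursor's row count."""
        with self._conn:  # commits on success, rolls back on an exception
            cursor = self._conn.execute(sql, params)
            return cursor.rowcount

    def fetch_all(self, sql: str, params: Sequence[Any] = ()) -> List[Dict[str, Any]]:
        """Run a query and return every row as a dict."""
        cursor = self._conn.execute(sql, params)
        return [dict(row) for row in cursor.fetchall()]

    def fetch_one(self, sql: str, params: Sequence[Any] = ()) -> Optional[Dict[str, Any]]:
        """Run a query and return the first row as a dict, or None if empty."""
        row = self._conn.execute(sql, params).fetchone()
        return dict(row) if row is not None else None

    def close(self) -> None:
        self._conn.close()


if __name__ == "__main__":
    db = DataAccess()
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    # Parameters are passed separately, never formatted into the SQL string.
    db.execute("INSERT INTO customers (name) VALUES (?)", ("Ada",))
    print(db.fetch_all("SELECT id, name FROM customers"))
    db.close()
```

Even in a sketch this small there are choices worth checking in anything an assistant hands you: parameterised queries instead of string formatting, and an explicit commit or rollback around writes.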
How Will ChatGPT Change Data Engineering?
It’s all about speed, right? Time is money, and data engineering is a skilled profession which commands high pay.
Companies are prepared to pay this as the return on their investment is typically well worth it, but everyone involved always wants to get things done as quickly as possible.
The faster we can work and produce things (while keeping quality high), the more money can be made for the business and the greater the return on its investment in hiring a data engineer.
Having a highly skilled assistant at your disposal that doesn’t get tired of you asking it to do work, and doesn’t want any money in return for sharing its vast knowledge with you, is something to take advantage of.
ChatGPT will add velocity to data engineering. It won’t be used in all cases, but for data engineers who write their own code as opposed to using an off-the-shelf tool, it will add significant value.
Data engineers who use ChatGPT correctly should deliver projects faster by taking advantage of this fine piece of technology from OpenAI.
Will ChatGPT Negatively Impact the Data Engineer?
Well, it is rather like copying and pasting someone’s work from a blog post or from a question posted on a tech forum somewhere: all such activities need to be done with care.
You may be thinking, wow – I can just have it write me pipelines, the job is done and I can take the rest of the day off. But you are ultimately still responsible for what you produce for your business or client.
Simply taking everything it says as gospel, not checking what it has produced, and then finding out later that it contained bugs or some performance issue will ultimately be your downfall.
- You have to check its output
- You have to understand what it is doing
- You still have to learn
- You still have to work hard
- You are still responsible
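One way to act on the first two points is to put a few assertions around whatever the assistant wrote before it goes anywhere near a pipeline. The sketch below is only an illustration: `latest_per_key` is a hypothetical helper standing in for generated code, and the checks are examples of the edge cases I would want to see pass (duplicates, empty input, rows arriving out of order).

```python
from typing import Dict, Iterable, List


def latest_per_key(rows: Iterable[dict], key: str, order_by: str) -> List[dict]:
    """Keep one row per key: the row with the highest order_by value.

    Hypothetical stand-in for a helper an assistant might have written.
    """
    latest: Dict[object, dict] = {}
    for row in rows:
        current = latest.get(row[key])
        if current is None or row[order_by] > current[order_by]:
            latest[row[key]] = row
    return list(latest.values())


def check_latest_per_key() -> None:
    """Checks you might run before trusting the helper above."""
    rows = [
        {"id": 1, "updated": 1, "value": "old"},
        {"id": 1, "updated": 2, "value": "new"},
        {"id": 2, "updated": 1, "value": "only"},
    ]

    # Duplicates collapse to the most recently updated row.
    result = latest_per_key(rows, key="id", order_by="updated")
    assert sorted(r["value"] for r in result) == ["new", "only"]

    # Empty input yields an empty result rather than an error.
    assert latest_per_key([], key="id", order_by="updated") == []

    # Arrival order must not change which row wins.
    shuffled = list(reversed(rows))
    result = latest_per_key(shuffled, key="id", order_by="updated")
    assert sorted(r["value"] for r in result) == ["new", "only"]


if __name__ == "__main__":
    check_latest_per_key()
    print("all checks passed")
```

Writing the checks also forces you to understand what the code is meant to do, which is rather the point of the list above.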
Conclusion
Data Engineering has a new assistant which is there to help increase your output. Take advantage of it and realise its potential but always remember:
With great power comes great responsibility.