Coding assistants are popping up everywhere these days (Copilot, JB Assistant, ChatGPT, etc.). They promise fast and easy coding for developers and offer a variety of ways to solve the same problem. Typically, the developer chooses among generated snippets for one or more lines of code, along with autocomplete suggestions.

This sounds like a wonderful tool with endless possibilities, and in many ways it is, but that is not the whole story.

Because assistants are AI systems that predict text and make recommendations, they always carry some margin of error. That can leave us with code that is incorrect or computationally inefficient, and this is why a beginner may fall for flawed code.

In their research on GitHub Copilot, Nguyen and Nadi write:

“However, as pointed out by GitHub, Copilot’s suggestions may not always work or even make sense. Thus, it is important to assess the correctness and quality of Copilot’s suggestions to provide better insights into the overall performance of the tool.”

To be clear, in this context a Junior is a person starting their coding journey, and a Senior is someone who has been coding for a while. Let's go through the basics first.

How Is an Assistant Built?

Deep learning has grown popular over the past decade. It basically means creating a program that learns by trial and error, “saving” the path that leads to the most relevant answer. The concept is loosely based on how the human brain works, and like a human, it will make mistakes.

Let's illustrate a deep learning model:

[Image Source: O'Reilly]

Deciphering the schema above in nontechnical language: the model receives an enormous number of files containing all kinds of code written by programmers and stored on the internet. The files are probably tagged by programming language (C#, C++, Java, Python, etc.), and the comments in that code can be in any human language (Arabic, English, Italian, etc.).

This input is then translated into a form the computer can work with: lists of numbers (stored in a mathematical structure called a tensor) that represent the “semantic proximity” of each word. [For a more accurate and complex explanation of tensors, here is a good text about it.]
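To make “semantic proximity” a bit more concrete, here is a minimal sketch in Python. The three-number vectors are invented for illustration (a real model learns vectors with hundreds or thousands of dimensions), and cosine similarity is just one common way to compare them, not necessarily what any particular assistant uses.

```python
import math

# Hypothetical vectors, made up for this example. In a real model these
# numbers are learned from the training data, not written by hand.
embeddings = {
    "for":   [0.9, 0.1, 0.0],
    "while": [0.8, 0.2, 0.1],
    "print": [0.1, 0.9, 0.3],
}

def cosine_similarity(a, b):
    """Return a score near 1.0 when two vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

loop_vs_loop = cosine_similarity(embeddings["for"], embeddings["while"])
loop_vs_print = cosine_similarity(embeddings["for"], embeddings["print"])

# The two loop keywords score higher with each other than with "print".
print(loop_vs_loop > loop_vs_print)
```

With these made-up numbers, the two loop keywords end up "closer" to each other than either is to `print`, which is the kind of relationship the model relies on.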

The model runs this preprocessed data and makes many statistical predictions, starting out close to random, in search of the most accurate answer. It keeps the settings that produced the best answers and learns which features matter most for an accurate prediction. In the end, the best configuration found is saved as the “final” model.

After this procedure, given a question, the model outputs the combination of tensors it expects to be the answer and translates it back into human language.
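The idea of predicting text from statistics can be sketched with something far simpler than deep learning. The example below is a toy bigram counter, not a neural network: it only counts which token followed which in a tiny, hypothetical corpus and suggests the most frequent follower. It is an assumption-laden illustration of the principle, not how any real assistant is implemented.

```python
from collections import Counter, defaultdict

# Hypothetical toy "training data": a few tokenized lines of code.
corpus = [
    ["for", "i", "in", "range", "(", "n", ")", ":"],
    ["for", "item", "in", "items", ":"],
    ["for", "i", "in", "range", "(", "10", ")", ":"],
]

# Count which token follows which across the whole corpus.
following = defaultdict(Counter)
for line in corpus:
    for current, nxt in zip(line, line[1:]):
        following[current][nxt] += 1

def suggest(token):
    """Return the most frequent follower of `token`, or None if unseen."""
    if token not in following:
        return None
    return following[token].most_common(1)[0][0]

print(suggest("in"))     # "range" followed "in" twice, "items" only once
print(suggest("while"))  # None: "while" never appeared in the corpus
```

Even this toy shows the two failure modes discussed in this article: the suggestion is only the most frequent continuation, not necessarily the right one, and the model has nothing useful to say about input it never saw.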

Why Are There Errors?

In any AI model, there is a margin for error.

Two related concepts are overfitting and underfitting. Overfitting is when a model learns its training data too closely, in the extreme memorizing 100% of it. This is bad because the model then generalizes poorly: given new data that it has not seen, it is likely to give a wrong answer or no useful answer at all.

Conversely, underfitting is when a model lacks the data or training to capture the underlying patterns and so cannot make a strong prediction. Its answers will be close to random, and the user will rarely get a correct or consistent one.
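A small sketch can make the contrast visible. The data, the noise values, and the three "models" below are all assumptions chosen for illustration: a memorizer that only recalls training points (standing in for overfitting), a constant predictor that ignores the input (standing in for underfitting), and a plain least-squares line as a reasonable middle ground.

```python
# Hypothetical training data: roughly y = 2x + 1 with small hand-picked noise.
train_x = [0, 1, 2, 3, 4, 5]
noise = [0.1, -0.2, 0.15, -0.1, 0.05, -0.05]
train_y = [2 * x + 1 + n for x, n in zip(train_x, noise)]

# Unseen data the models were not trained on.
test_x = [6, 7]
test_y = [2 * x + 1 for x in test_x]

def memorizer(x):
    """Overfit stand-in: recall the answer of the nearest training point."""
    nearest = min(train_x, key=lambda k: abs(k - x))
    return train_y[train_x.index(nearest)]

mean_y = sum(train_y) / len(train_y)

def constant(x):
    """Underfit stand-in: ignore the input and always answer the mean."""
    return mean_y

# Ordinary least-squares line fitted to the training data.
mean_x = sum(train_x) / len(train_x)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y)) / sum(
    (x - mean_x) ** 2 for x in train_x
)
intercept = mean_y - slope * mean_x

def line(x):
    return slope * x + intercept

def mean_abs_error(model, xs, ys):
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)

for name, model in [("memorizer", memorizer), ("constant", constant), ("line", line)]:
    print(
        name,
        round(mean_abs_error(model, train_x, train_y), 3),
        round(mean_abs_error(model, test_x, test_y), 3),
    )
```

The memorizer is perfect on the data it has seen and poor on new inputs, the constant predictor is poor everywhere, and the line makes small errors on both. That last trade-off is what a well-trained model aims for.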

It follows that an AI model that gives satisfactory answers in general cannot be 100% accurate.

It will give the user mostly good and correct answers but will sometimes provide wrong or poorly structured ones.

Junior Is Not Worthy

Despite the joke in this headline, beginning to code can be quite a challenge, even more so for someone who is not familiar with programming logic. For that reason, a code assistant can mislead a beginner by giving incorrect or poorly performing code.

Someone who's starting their career should face the challenge of coding on their own, with the mentoring of a senior developer, and try to find their own path through the problems presented to them. This helps build a coding style that will follow the developer for the rest of their career and shape their approach to new challenges. At this stage, the person is not yet trained enough to recognize wrong code or high computational costs.

If they start using an Assistant before reaching some maturity, they will learn the Assistant's coding style, which is a blend of code written by many different people. This can lengthen the learning curve and make their own code harder for them to understand. On top of that, the Assistant can give misleading answers to the user’s questions.

Assistants are programmed to answer questions, and regardless of how precise the answer is, they will show you something. Imagine the programmer wants to get a column from a database, iterate through the data to find a specific pattern, and do a complex calculation on it. The Assistant will provide some answers, and they can be almost right. But the pattern it matched may be inverted: when the programmer runs the code, everything goes smoothly, yet the final calculation is all wrong. Or the iteration method suggested by the Assistant carries an immense computational cost, and the code takes hours to run.
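Here is a hedged sketch of both failure modes in Python. The column values, the threshold, and every function name are hypothetical and exist only for this illustration. None of this is output from a real assistant.

```python
# Hypothetical column values pulled from a database.
readings = [12, 7, 30, 5, 18, 3]

def total_above_threshold(values, threshold):
    """Intended behaviour: sum the values strictly above the threshold."""
    return sum(v for v in values if v > threshold)

def suggested_total(values, threshold):
    """A plausible-looking suggestion with the comparison inverted."""
    return sum(v for v in values if v < threshold)

def has_duplicates_slow(values):
    """Quadratic: compares every pair, which gets costly on large columns."""
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            if values[i] == values[j]:
                return True
    return False

def has_duplicates_fast(values):
    """Linear on average: a set remembers which values were already seen."""
    seen = set()
    for v in values:
        if v in seen:
            return True
        seen.add(v)
    return False

# Both totals run without any error; only a hand-checked expectation
# reveals that the suggested version answers a different question.
expected = 12 + 30 + 18
print(total_above_threshold(readings, 10) == expected)  # True
print(suggested_total(readings, 10) == expected)        # False
```

Nothing crashes in either version, which is exactly the problem: a Senior tends to notice the flipped comparison or the nested loop on sight, while a Junior has little reason to doubt code that runs. A small hand-computed check like `expected` is one cheap way to catch the first kind of mistake.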

Assistants Are Not the Enemy

In conclusion, code assistants are very useful and accelerate the coding process. For beginners, however, we should discuss whether it is good to give them access. Someone very senior in one or many programming languages will not be thrown off by the Assistant even when it gives a wrong suggestion. But for someone whose coding abilities are not yet solid, the Assistant becomes the “coach” of this professional, and if the coach doesn't know exactly what it is doing, we will end up with a bad future professional coder.


Author

Filipe Castro

I've always been passionate about research and development, which led me to become a Data Scientist. What caught my attention the most is how "magical" the concept of what AI can or cannot do is. Therefore, I enjoy sharing my knowledge to demystify the concept of all-knowing and movie-like AI.

