Are you wondering how knowledge silos manifest in data science teams? Or are you more interested in strategies that can help prevent knowledge silos from appearing in data teams? Either way, you are in the right place. In this article, we cover what you need to know about knowledge silos in data teams.
At the beginning of this article, we discuss what knowledge silos are and go over the different ways they manifest in data teams. After that, we discuss some of the main pain points that knowledge silos can cause for data teams. Finally, we provide some examples of techniques and rituals that can help prevent knowledge silos from cropping up in data teams.
What are knowledge silos?
So what are knowledge silos? A knowledge silo occurs when there is a specific area that is only understood by one person or a small group of people in a given discipline. Generally, other people in the discipline do not have a sufficient understanding to be able to contribute to a project in that area.
Knowledge silos at the individual level and the team level
In this article, we will make a distinction between knowledge silos that exist at the team level and knowledge silos that exist at the individual level. The main difference here is the number of people who understand the area in question. Knowledge silos at the team level occur when a small team or group of people has knowledge about a given area, but adjacent teams do not. Knowledge silos at the individual level occur when a single person is the only one who has context about a certain area.
Throughout this article, we will focus more on knowledge silos that exist at the individual level than the team level. There are two reasons for this. First, knowledge silos at the individual level generally present a larger range of issues than knowledge silos at the team level. Second, data organizations tend to be smaller than other organizations, such as engineering organizations. As a result, data organizations are more likely to have a few people who are spread out across many areas, which makes them particularly vulnerable to knowledge silos at the individual level.
Knowledge silos at the domain, technology, and project level
We will also make a distinction between knowledge silos that exist at the domain, technology, and project level. The main distinction here is the type of area where the knowledge silo exists.
- Business domain. At the domain level, knowledge silos occur when there is only one person on the team that works on projects in a given business domain. For example, you might run into this issue if one person on the team works with marketing data, one works with finance data, and one works with advertising data. If each person only works with stakeholders in their respective domain and does not have much visibility into what is happening in other domains, then there are likely knowledge silos at play. Knowledge silos at the business domain level are particularly insidious because there is a large risk that crucial domain knowledge can be lost from the team. This can hamper any future projects that relate to that domain.
- Technology. At the technology level, knowledge silos occur when there is only one person on the team that is familiar with a particular tool or technology. While they are not ideal, knowledge silos at the technology level are generally not as insidious as knowledge silos at the domain level. Even if the only person on the team that understands a technology leaves, there are often external resources that can be used to educate other team members about that technology. Knowledge silos at the technology level are most harmful when they are related to internally developed tools that do not have external adoption.
- Project. In relation to a project, a knowledge silo can pop up when only one person has knowledge about a specific project. Project level silos can exist even when there are multiple people on the team who work in the same business domain or utilize the same technologies. While project level silos can certainly cause issues of their own, they are generally not as harmful as knowledge silos that exist at the business domain or technology level. This is because they only affect one specific project rather than an assortment of projects.
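To make the distinctions above concrete, here is a minimal sketch of how a team might look for individual-level silos across the domain, technology, and project levels. It assumes you keep some record of who has working knowledge of which area. The `assignments` data, the names in it, and the `find_individual_silos` helper are all hypothetical and only meant as an illustration.

```python
from collections import defaultdict

# Hypothetical records of who has working knowledge of what: (person, level, area).
# The people and areas are made up for illustration.
assignments = [
    ("ana", "domain", "marketing"),
    ("ben", "domain", "finance"),
    ("ana", "domain", "finance"),
    ("ana", "technology", "airflow"),
    ("ben", "technology", "airflow"),
    ("cho", "technology", "internal-feature-store"),
    ("cho", "project", "churn-model"),
]


def find_individual_silos(records):
    """Return {(level, area): person} for every area known by exactly one person."""
    people_by_area = defaultdict(set)
    for person, level, area in records:
        people_by_area[(level, area)].add(person)
    return {
        key: next(iter(people))
        for key, people in people_by_area.items()
        if len(people) == 1
    }


silos = find_individual_silos(assignments)
for (level, area), person in sorted(silos.items()):
    print(f"{level}: {area} is only understood by {person}")
```

In this made-up data, the marketing domain, the internal feature store, and the churn model are each known by a single person, so they are flagged. Finance and Airflow are each known by two people, so they are not.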
What problems do knowledge silos cause in data teams?
Here are some examples of problems that knowledge silos can cause in data teams.
- Large disruptions when someone leaves the company. The main issue with having knowledge silos in data teams is that they can cause large disruptions when someone leaves the company, or even when someone decides to join a different team at the same company. If only one person has knowledge about a specific project or domain area, it will be difficult for someone else to jump in and take over the project when that person leaves. At best, someone will have to make an oversized time investment to learn about the project so that they can take over. At worst, the project will come grinding to a halt. Either way, there will likely be a large loss in knowledge as no one will understand how or why certain decisions were made.
- Difficulties when taking vacation time. For a team with knowledge silos, it is also hard for anyone to take a vacation or use sick time. If no one else on the team is knowledgeable about the projects a given person is working on, there will not be anyone who can step in as backup when that person is out. This means that collaborators from other disciplines will be blocked and progress on the project will slow. When this happens, collaborators on other teams may become disgruntled or try to message the person while they are out of office.
- Repeated or overlapping work. Another issue that arises when there are knowledge silos on a data team is repeated or overlapping work. This repeated work often happens across different projects or domain areas. If someone does not have much visibility into what their teammates are working on, they might not realize that someone else is working on the same type of problem. Each team member then solves the same problem in isolation, which can take up to twice the time and effort.
- Lack of innovation. Another issue with knowledge silos is that they make it hard for teammates to bounce ideas off of each other and get feedback. This can stifle innovation as workers tend to come up with more creative solutions to problems when they are able to get another point of view and work collaboratively. This may result in a project that was actually feasible being abandoned because one person could not think of a solution that would be applicable.
- Incorrect use of data. Another issue that occurs when there are knowledge silos across individuals or teams is incorrect use of data. If different teams own different datasets and there is not clear documentation on exactly what each column in each table represents, one team may make an incorrect assumption about the data that is contained in a given column and use the data incorrectly.
- Longer time to solve incidents. If you work on a team that handles production code, knowledge silos can be particularly harmful. If there are critical incidents or bugs that need to be resolved immediately, you may wind up in a situation where there is only one person who has the knowledge needed to remedy the situation. When there is only one person who has the context necessary to address the issue, it will generally take longer to find the issue or implement a fix.
- Limited career growth or unequal distribution of opportunities. Finally, knowledge silos can stifle career growth or limit growth opportunities. This is most common when the silos exist at the domain or technology level rather than the project level. If a person is constrained to working in a specific domain or technology, there will not be as many options open to them when they want to learn a new skill. Certain domains may also have higher impact or more challenging projects available, which can result in an unequal distribution of opportunities.
How to prevent knowledge silos in data teams
Here are some examples of practices that can prevent knowledge silos from cropping up in data teams, or at least reduce their harmful effects.
- Explicitly assign someone as backup for each project. One strategy that can be employed to fight against knowledge silos at the individual level is to make sure that there are always at least two people who are accountable for each project. Even if there is only one person who is actively working on a project, there should be a second person who is expected to stay up to date with the project. This will ensure that the primary person who is working on the project has someone else to bounce ideas off of.
- Encourage a culture of feedback. One of the most important things you can do to fight against knowledge silos at the individual level is encourage a culture of feedback within the team. The more frequently team members are looking at each other’s project proposals, design documents, table schemas, and code, the better they will understand their teammates’ work.
- Enforce code reviews. Code is perhaps the most important type of artifact that should be regularly reviewed. Enforcing code reviews helps ensure that all important pieces of code are understood by multiple team members. If teammates are regularly reading each other’s code, it is more likely that they will be able to jump in and assist when needed.
- Encourage pair programming. Encouraging pair programming is another great way to fight against knowledge silos at the individual level. Just like code reviews, pair programming sessions ensure that teammates are taking time to read and understand each other’s code.
- Hold demos. Hosting demos where team members can share something they are currently working on can also help to solve problems that arise from knowledge silos at both the individual level and the team level. For example, demos can help to solve the problem of duplicate work by giving team members visibility into the kinds of problems their teammates are working on.
- Create robust documentation. Creating robust documentation is one of the best ways to ameliorate the effects of knowledge silos that exist at the team level. In addition to creating this documentation, it is also important that the documentation be stored in an easily accessible and discoverable location. Documentation is only useful if people know where to find it when they need it.
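As a small illustration of the first practice above, the sketch below checks that every project has a backup owner who is a different person from the primary owner. It assumes the team keeps a simple registry of project owners. The `projects` dictionary, its field names, and the `projects_missing_backup` helper are hypothetical and not taken from any particular tool.

```python
# Hypothetical project registry. The project names, people, and the
# "primary"/"backup" field names are made up for illustration.
projects = {
    "churn-model": {"primary": "cho", "backup": "ana"},
    "marketing-attribution": {"primary": "ana", "backup": None},
    "revenue-forecast": {"primary": "ben", "backup": "ben"},
}


def projects_missing_backup(registry):
    """Return the sorted names of projects without a distinct backup owner."""
    missing = []
    for name, owners in registry.items():
        primary = owners.get("primary")
        backup = owners.get("backup")
        # A backup only counts if it is set and is not the same person as the primary.
        if not backup or backup == primary:
            missing.append(name)
    return sorted(missing)


for name in projects_missing_backup(projects):
    print(f"{name} has no backup owner")
```

A check like this could be run whenever the registry changes, so that a project without a real backup is noticed before the primary owner goes on vacation or leaves the team.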
Best practices for data teams
Check out our article on data science best practices for all of our best recommendations on how to increase the efficacy of data science teams.