Structured data is information organized in a predefined format, such as rows and columns in a database, and it is widely used to train machine learning models. More often than not, however, real-world data does not fall into this neat category. It arrives as free text, images, and videos, which machines cannot readily interpret. This latter form of information is known as unstructured data; it makes up more than 80 percent of enterprise data, and its volume grows every year.
Given how prevalent unstructured data is, extracting knowledge from it is necessary. It is also far harder for machine learning programs to work with, and it often leaves gaps in what a model can identify. For Zhang, this is a central research focus, and one for which he has received multiple recognitions.
“Turning unstructured data into structured knowledge is of great importance to science, engineering, and business. For example, knowledge graphs can be used to accelerate scientific research and empower smartphone virtual assistants,” he said.
“Deep learning models are currently dominating for almost all knowledge extraction problems. However, they do not have the ability to say, ‘I don't know,’ when facing novel situations and can be unreliable in open-world settings.”
Zhang’s research addresses this challenge by developing new techniques that quantify uncertainty in deep learning models, so that a model can tell when it is unable to recognize something and decide how to proceed. This research could be an important step toward improving the robustness and effectiveness of existing knowledge extraction technology.
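To make the idea of a model saying "I don't know" concrete, here is a minimal sketch of one simple baseline: abstaining when the entropy of the predicted class probabilities is high. This is an illustration only and not Zhang's method. The function names, the entity labels, and the 0.5 threshold are all assumptions chosen for the example.

```python
import math

def softmax(logits):
    """Convert raw model scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Entropy scaled to [0, 1]; values near 1 mean the model is very unsure."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def predict_or_abstain(logits, labels, max_entropy=0.5):
    """Return the top label, or abstain when uncertainty exceeds the threshold.

    max_entropy is a hypothetical threshold; in practice it would be tuned
    on held-out data.
    """
    probs = softmax(logits)
    if normalized_entropy(probs) > max_entropy:
        return "I don't know"
    return labels[probs.index(max(probs))]

# Hypothetical entity types for a knowledge extraction task.
labels = ["person", "organization", "location"]
confident = predict_or_abstain([5.0, 0.1, 0.2], labels)
unsure = predict_or_abstain([1.0, 1.0, 1.0], labels)
```

In this sketch, the first call returns a label because one score clearly dominates, while the second abstains because all three classes are equally likely. Softmax scores can themselves be overconfident on novel inputs, which is one reason more principled uncertainty estimates are an active research area.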
“It is great to be recognized by the Google Faculty Research Award. The support of this award will allow my group to continue making our contributions to solving some pressing problems in this area,” said Zhang.
Zhang’s work in this field has previously been recognized with the 2019 SIGKDD Doctoral Dissertation Runner-up Award and the 2015 ECML/PKDD Best Student Paper Runner-up Award.