Welcome to insideBIGDATA’s “Heard on the Street” round-up column! In this regular feature, we highlight thought-leadership commentaries from members of the big data ecosystem. Each edition covers the trends of the day with compelling perspectives that can provide important insights to give you a competitive advantage in the marketplace. We invite submissions with a focus on our favored technology topic areas: big data, data science, machine learning, AI, and deep learning. Enjoy!
“How Do We Bring Light to Dark Data?” Commentary by Dale Lutz, Co-Founder & Co-CEO of Safe Software
“The world has never collected more data than right now, with 90% of all data created in the last two years alone. Given the sheer volume of data being collected by billions of people every second, it’s becoming overwhelming for organizations to manage and make the most of it. Data that organizations collect but do not use is called ‘dark data,’ and it makes up the majority of the data enterprises collect. Dark data has the potential to be transformed into incredibly useful information for enterprises. As we gain a deeper understanding of AI, we could be on the precipice of an exciting new frontier in the data economy. For example, emerging technology could filter and aggregate huge data volumes into more actionable and analyzable datasets. It could also surface patterns in dark data that organizations would typically overlook. For enterprises, this could mean finding new markets, identifying outliers that foretell important risks or opportunities, assessing the potential for equipment failure, targeting potential customers, or preparing training data for machine learning and artificial intelligence. Modern integration approaches can further extend the utility of otherwise dark data by joining it to other datasets, with the result that the whole is far more valuable than the sum of the parts. It’s an exciting time in the data industry as new technologies like AI and modern data integration approaches hold the potential to shine light onto the underused and undervalued underside of the data iceberg.”
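To make the integration point concrete, here is a minimal sketch of joining a hypothetical “dark” equipment log to reference data and screening for outliers. The file names, column names, and the three-sigma threshold are all assumptions for illustration, not Safe Software’s method.

```python
# Illustrative only: join an otherwise "dark" log dataset to reference
# data, then surface outliers. File names and columns are hypothetical.
import pandas as pd

# Dark data: raw equipment logs collected but never analyzed.
logs = pd.read_csv("equipment_logs.csv")      # device_id, temp_c, vibration
devices = pd.read_csv("device_registry.csv")  # device_id, site, model

# Integration step: joining the dark dataset to reference data makes it analyzable.
enriched = logs.merge(devices, on="device_id", how="left")

# Simple aggregation plus an outlier screen: flag readings more than three
# standard deviations above the norm for that equipment model.
stats = enriched.groupby("model")["vibration"].agg(["mean", "std"]).rename(
    columns={"mean": "model_mean", "std": "model_std"}
)
enriched = enriched.join(stats, on="model")
outliers = enriched[
    enriched["vibration"] > enriched["model_mean"] + 3 * enriched["model_std"]
]
print(outliers[["device_id", "site", "model", "vibration"]].drop_duplicates())
```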
“Data Science, Discipline, and Decision-Making: The Three D’s of Investment Management.” Commentary by Paul Fahey, Head of Investment Data Science, Asset Servicing, Americas at Northern Trust, and Greg McCall, President and Co-Founder of Equity Data Science (EDS)
“Performance data for investment managers is available for their clients, and even the public, to see on a daily, quarterly, or longer-term basis. When it comes to making investment decisions, however, asset managers are looking to data sources that are not so easy to find, to generate insights and gain an edge on the competition. A 2023 survey of 150 global asset managers by Northern Trust and Coalition Greenwich found that managers are focusing on more quantifiable/disciplined investment processes as a key avenue to achieving alpha. In line with this approach comes a focus on data management, as many investment teams face challenges effectively managing their data to make better decisions. While Excel spreadsheets and Word documents have been go-to tools for decades (and still have a role to play), they lack advancements in workflow integration, analytics, and data management. As a result, investment teams often resort to fragmented workflows and storing critical intelligence in various systems such as Google Drive, Outlook, or Evernote. This decentralization leads to inefficiencies, increased risk, limited collaboration, and missed opportunities. This is where data science comes into play. Investment data science allows the consumption of large data sets from multiple providers and sources through cloud-based technology and enables investment teams to interrogate their data to gain meaningful insights. This can go beyond number-crunching of market and reference data to incorporate the manager’s proprietary data around investment process management: trading patterns, analyst research, buy-and-sell discipline, macro strategy, and other information often stored in siloed or disparate locations. While the computing power needed for data science has historically been available only to the largest asset managers, new cloud-based tools are democratizing the application of data science to the investment process for a broader audience. Small and mid-sized asset managers now have cost-effective access to deeper analytics, ensuring they can compete on investment expertise and not on the ability to invest heavily in technology. These platforms can enhance the investment process by bringing data into a central ecosystem, allowing for greater collaboration and accountability. From pre-trade to post-trade, data science can unlock insights into a manager’s decision-making, providing a holistic view of their processes. With each insight, managers can develop a more quantifiable, disciplined investment approach, giving them an edge in the ongoing battle for alpha.”
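As a hedged sketch of what that centralization can look like, the snippet below pulls two fragmented sources (a trade blotter and analyst notes) into one frame and computes a simple discipline metric. The file names, columns, and the metric itself are illustrative assumptions, not features of Northern Trust’s or EDS’s platforms.

```python
# Hypothetical sketch: consolidate fragmented research artifacts and
# measure one aspect of buy-and-sell discipline.
import pandas as pd

trades = pd.read_csv("trades.csv", parse_dates=["date"])        # ticker, date, side, price
notes = pd.read_csv("analyst_notes.csv", parse_dates=["date"])  # ticker, date, rating

# Central-ecosystem step: attach the most recent analyst rating to each trade.
merged = pd.merge_asof(
    trades.sort_values("date"),
    notes.sort_values("date"),
    on="date", by="ticker", direction="backward",
)

# Discipline check: how often do sells contradict a still-live BUY rating?
sells = merged[merged["side"] == "SELL"]
contradiction_rate = (sells["rating"] == "BUY").mean()
print(f"Sells placed against a live BUY rating: {contradiction_rate:.1%}")
```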
“Why the ‘AI arms race’ doesn’t exist.” Commentary by Tal Shaked, Chief Machine Learning Fellow at Moloco
“The ‘AI arms race’ is all over the headlines in reference to the technologies being developed by companies like Google, Amazon, Apple, and others. This is not only a misnomer; it also does a disservice to the public and to governments, both of which are trying to understand the technology’s capabilities and impact. By definition, artificial intelligence draws a comparison to human intelligence, but the fact is that computers are wired differently from humans, and therefore the intelligence they display through ML isn’t exactly ‘artificial.’ Rather, it is a different kind of intelligence, machine intelligence, uniquely enabled by nearly infinite storage and compute that humans lack. The most valuable companies in the world have been innovating with machine intelligence for more than 20 years to develop better ways to interface with humans: to ‘organize the world’s information,’ to ‘be Earth’s most customer-centric company,’ and to ‘bring the best user experience to its customers.’ Advances in ML are enabling new types of machine intelligence that are fueling innovations for the world’s most valuable companies today, as well as those to come. Leaders and businesses should be racing to build the best teams that understand these technologies and can leverage them to build new products that will disrupt every business area.”
“Thoughts on the string of recent data breaches.” Commentary by Zach Capers, Senior Analyst at Capterra and Gartner
“Data breaches are a top concern for data security teams, given their financial and reputational ramifications. As evidenced by the breaches of Tesla and Discord, businesses must be aware of threats stemming from human factors. In these cases, it took a pair of disgruntled employees and one compromised customer support account to put the sensitive information of thousands at risk. A robust data classification program helps organizations avoid costly breaches that put sensitive data at risk. But the process of identifying and labeling various types of data is often both overwhelming and overengineered. Businesses should focus on implementing three fundamental levels of data classification, leveraging automation over manual methods for data management where possible, and prioritizing security over compliance.”
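To ground the idea of automated, three-level classification, here is a minimal Python sketch that labels text as restricted, internal, or public by pattern matching instead of manual review. The level names and patterns are assumptions for illustration, not Capterra’s recommended rule set.

```python
# A toy three-level data classifier driven by regex patterns.
import re

PATTERNS = {
    "restricted": [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),    # US SSN-shaped number
        re.compile(r"\b(?:\d[ -]?){13,16}\b"),   # payment-card-shaped number
    ],
    "internal": [
        re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),  # email address
        re.compile(r"\bconfidential\b", re.IGNORECASE),
    ],
}

def classify(text: str) -> str:
    """Return 'restricted', 'internal', or 'public' for a piece of text."""
    for level in ("restricted", "internal"):     # check most sensitive first
        if any(p.search(text) for p in PATTERNS[level]):
            return level
    return "public"

print(classify("Customer SSN on file: 123-45-6789"))  # restricted
print(classify("Contact support@example.com"))        # internal
print(classify("Q3 launch announcement draft"))       # public
```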
“Thoughts on the string of recent data breaches.” Commentary by Nikhil Girdhar, Senior Director of Data Security, Securiti
“The recent data breach involving Johnson & Johnson’s CarePath application underscores the pressing need for a tactical overhaul in healthcare data security. As the sector moves swiftly towards digitization, patient data becomes a prized asset for cybercriminals. This mandates a critical reassessment of Data Security Posture Management (DSPM) strategies across healthcare organizations. In an environment where patient data is dispersed across multiple platforms, the challenge for security teams, which often operate with finite resources, is to effectively pinpoint and secure vulnerable assets. A data-centric approach can optimize resource allocation by focusing on high-value assets. This enables more precise application of safeguards such as least-privilege access controls, data masking, and configuration management, particularly for key applications like CarePath. The paradigm must also shift from an ‘if’ to a ‘when’ mindset regarding breaches. Prioritizing data encryption is not just advisable; it’s essential. Moreover, automating incident analysis can accelerate notifications to impacted parties, enabling them to take proactive measures to protect their information. When integrated, these steps forge a formidable defense against increasingly advanced cyber threats, offering security teams the tactical advantage they need.”
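Two of those safeguards, data masking and encryption at rest, can be sketched in a few lines of Python using the `cryptography` library. The field names and record shape below are hypothetical; this is a minimal illustration, not CarePath’s implementation.

```python
# Hedged sketch: mask direct identifiers for analytics use, and encrypt
# the full record at rest so a stolen copy is useless without the key.
import json
import re
from cryptography.fernet import Fernet  # pip install cryptography

def mask_record(record: dict) -> dict:
    """Replace direct identifiers with masked values before sharing."""
    masked = dict(record)
    masked["patient_name"] = "***"
    masked["email"] = re.sub(r"^[^@]+", "***", record["email"])
    return masked

record = {"patient_name": "Jane Doe", "email": "jane@example.com", "rx": "drug-42"}
print(mask_record(record))  # {'patient_name': '***', 'email': '***@example.com', ...}

# "When, not if": encrypt at rest; in practice the key lives in a KMS.
key = Fernet.generate_key()
token = Fernet(key).encrypt(json.dumps(record).encode())
assert json.loads(Fernet(key).decrypt(token)) == record
```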
“AI and Synthetic Content.” Commentary by Tiago Cardoso, Product Manager at Hyland Software
“Training AI models on synthetic content already happens in many cases, replacing human feedback to scale training and fine-tuning, since no people need to be in the loop. It is mostly used on smaller language models to improve performance, and the main implication is that it enables low-cost generative models with high performance. Nevertheless, using synthetic content might lead to bias inflation. The bias of the model producing the content…
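The bias-inflation risk can be made concrete with a toy experiment: a “student” trained purely on a skewed teacher’s synthetic output can end up more biased than the teacher itself. Everything below, including the 70/30 skew and the majority-label student, is a deliberately simplified assumption, not a real training pipeline.

```python
# Toy illustration of bias inflation from training on synthetic content.
import random
from collections import Counter

random.seed(0)

def teacher() -> str:
    """Synthetic-data generator with a 70/30 skew toward label 'A'."""
    return "A" if random.random() < 0.7 else "B"

# "Fine-tune" the student on synthetic content alone: here it simply
# learns the majority label of its training set, with no human feedback.
synthetic_corpus = [teacher() for _ in range(10_000)]
student_label = Counter(synthetic_corpus).most_common(1)[0][0]

# The teacher was 70% skewed; the student now answers 'A' every time.
print(Counter(synthetic_corpus))  # roughly 70/30 split
print(student_label)              # 'A'
```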