

Data science meets DevOps:
7 rules of machine learning



By: Yotam Yarden, senior data scientist at Amazon Web Services (AWS)

Machine Learning capabilities hold great potential for new revenue streams and tremendous cost savings for enterprises. Increasingly, businesses are using ML to strengthen their competitive advantage and drive innovation. Is your organization embracing this shift or are you falling behind? If you are on the “bias-for-action” side of the scale and have already started steering your organization towards digital & ML transformation, are you confident you are doing so in the right way?

Over the past decade, data has become increasingly important and has even been described as the “new oil”. Organizations with extensive user data can leverage it to increase sales and customer retention. Manufacturers can use machinery data to improve machine utilization. Computed tomography (CT) images can be used to identify cancerous tumors. There is virtually no industry segment that can’t leverage data to improve and create new business models. Meanwhile, data has never been easier and less expensive to collect, store, analyze, and share. Many enterprises are building their data lakes today precisely for this reason. But is your organization taking full advantage of its data? Are you satisfied with the value you generate from your data? Do you struggle with building smart applications on top of your data lake? Big Data but not enough insights? Too much talking, not enough walking?

If so, consider the following tips:

  1. Be business driven and customer focused: What are your organization’s biggest challenges? Start from a focused business challenge and work backward towards a solution. Too many companies try to apply “self-driving cars” or “genome-sequencing” algorithms to a sales funnel optimization challenge just because they hired an expert in this field, while often there are models that better fit the task and bring higher value at lower costs. Don’t keep your data science team in the IT department alone. Rather, giving ownership of the data science team to a business stakeholder can invigorate your organization, and unlock new revenue streams and tremendous cost savings.
  2. Iterate fast and simple: Be quick and decisive about bringing your ML system into production. Conducting small iterations through tests, proofs of concept, and pilots will help your team bring ML workloads into production faster and at higher quality. Plan to have a production-ready prototype in 3 weeks, and a fully operational version in under 90 days. Even if your system is not using the state-of-the-art model, you will learn far more by iterating quickly than you would from an overly long development cycle. ML transformations happen by building knowledge and experience through small, fast, and simple steps, rather than through multi-year planning. A redesign is inevitable. Only by experimentation, experience, and adaptation can you realize the full potential of your ML product. Fail fast and improve often.
  3. Centralize or Decentralize ML teams? Centralize ML teams when necessary, but aim to decentralize when possible. ML applications, like any other piece of software, require maintenance, updates, and support. A centralized team may be effective at a small scale, but once you start expanding, innovation might suffer. Imagine a large innovation team working on multiple innovative projects: it is inevitable that at some point a substantial portion of the team’s work will consist of operating ongoing projects. That might be a good time to move the team to its real home, within the business unit that it serves. It can be hard to “give away” your “baby”, but it will help your ML team innovate on behalf of your customers.
  4. Consider the biggest roadblocks for data scientists & developers[1]: 1) dirty data, e.g. data sets which are unstructured, have missing attributes, and mix data types in the same section; 2) lack of talent; 3) lack of management or financial support: ML projects require focus and funding, and organizations struggle to roll out such a project without management’s support; 4) lack of clear questions to answer: organizations chase improvement but lack the specifications and clear targets needed to achieve it; 5) data not available or difficult to access. If you plan appropriately, you will find that most of these roadblocks are easily overcome. Lack of talent? Start hiring talent ahead of demand rather than have the data waiting for talent. Data not available? Start collecting data in advance of the project kick-off. Data not accessible? Don’t kick off a workshop without first obtaining relevant data samples. Lack of management or financial support? Get the buy-in in advance. Find the stakeholders’ heroes who are enthusiastic about AI and can support you with budget & headcount approvals, data accessibility, and connections to other business stakeholders.
  5. The separation between Data Science and DevOps is over! “Our PhDs develop ML models and write specifications for our developers to implement in C++.” If you can relate to this customer quote, start changing your team’s structure today. There is a wide range of tools that enable data scientists to take a step towards engineering, and vice-versa. The separation of “science” and “production” can prolong your company’s development & innovation cycles, thus leading to quality and ownership issues. Thankfully, technology is evolving at an increasing pace and new tools are continually released. It has never been easier for experts to expand their capabilities and cross over into new domains.
  6. Keep the right Data Scientists/Data Engineers ratio: What is the optimal Data Scientists/Data Engineers ratio? For most customers, the answer will depend on the maturity of the business. If your data are not accessible or you don’t maintain and track your data, you will likely need more engineering and less science. On the other hand, if you already have an established data pipeline, data warehouse, and data lake, you will likely want more science and less engineering. In some cases, your business will have specific requirements, which can affect the skills needed as well. As a rule of thumb, plan to have 2-3 engineers for every data scientist in the building phase, and 1:1 when a system is already deployed.
  7. Have clear KPIs (Key Performance Indicators) by which your project’s success can be measured. For example, imagine a Recommendation Engine project for an online media company. “Enhance user experience” might be a great goal, but without a way to measure success, this objective is overly ambiguous. Stakeholders might even disagree over whether the goal has been met, which can cause wasted resources and inefficient development. Can “enhancing the user experience” be measured by time spent on the platform? The number of videos watched? The number of new categories explored by the user? Each measure could lead to a different recommendation system.
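
To make the last point concrete, here is a minimal Python sketch of how the three candidate measures from the recommendation-engine example could be computed, and how they can disagree about which system is “better”. Everything in it is an illustrative assumption rather than part of the original article: the `Session` record, the `kpi_summary` and `best_by_kpi` helpers, the two hypothetical recommender variants, and the made-up numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """One viewing session (hypothetical record layout)."""
    user: str
    minutes_watched: float
    videos_watched: int
    categories: frozenset  # categories watched during this session


def kpi_summary(sessions, known_categories):
    """Compute three candidate KPIs, each averaged per user.

    known_categories maps a user to the categories they had already
    watched before the experiment started.
    """
    users = {s.user for s in sessions}
    if not users:
        raise ValueError("at least one session is required")

    new_categories = 0
    for user in users:
        # Union of everything this user watched during the experiment.
        seen_now = set().union(*(s.categories for s in sessions if s.user == user))
        new_categories += len(seen_now - known_categories.get(user, set()))

    return {
        "minutes_per_user": sum(s.minutes_watched for s in sessions) / len(users),
        "videos_per_user": sum(s.videos_watched for s in sessions) / len(users),
        "new_categories_per_user": new_categories / len(users),
    }


def best_by_kpi(summaries):
    """For each KPI, return the name of the variant with the highest value."""
    kpis = next(iter(summaries.values())).keys()
    return {k: max(summaries, key=lambda name: summaries[name][k]) for k in kpis}


# Made-up data: variant A favors long videos, variant B favors short, varied clips.
known = {"u1": {"news"}, "u2": {"sport"}}
variant_a = [
    Session("u1", 60, 2, frozenset({"news"})),
    Session("u2", 50, 2, frozenset({"sport"})),
]
variant_b = [
    Session("u1", 30, 6, frozenset({"news", "music"})),
    Session("u2", 20, 5, frozenset({"sport", "cooking", "travel"})),
]

summaries = {
    "A": kpi_summary(variant_a, known),
    "B": kpi_summary(variant_b, known),
}
winners = best_by_kpi(summaries)
print(winners)
```

With this invented data, variant A comes out ahead on time spent while variant B comes out ahead on videos watched and on new categories explored, so the choice of KPI decides which system the team would ship.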

Having clear goals & KPIs will help you plan and execute more effectively.

ML initiatives are exciting and can be extremely fruitful. However, lack of focus, limited resources, and improperly set expectations can cause anxiety. Holding an “ML Discovery Workshop”, in which all stakeholders, both business and technical, brainstorm ideas, discuss their company’s biggest challenges, and plan together, can help enormously. During the workshop, list all of your biggest challenges, their feasibility, estimated efforts, and missing skills and tools, and come up with a list of projects and a concrete execution plan. However, even the most well-intended execution plan will flounder without proper focus. With this in mind, remember: Be Customer Focused, Iterate Fast, Distribute data science when effective, Plan for roadblocks, Staff appropriately, and Choose specific KPIs that matter.

The writer is a senior data scientist at AWS and has been helping enterprises with their machine learning and cloud journey.


Revealing the real cost of ‘free’ online services

A free service by Finnish cybersecurity provider F-Secure reveals the real cost of using “free” services by Google, Apple, Facebook, and Amazon, among others.



What do Google, Facebook, and Amazon have in common? Privacy and identity scandals. From Cambridge Analytica to Google’s vulnerability in Google+, the amount of personal data sitting on these platforms is enormous.

Cybersecurity provider F-Secure has released a free online tool that helps expose the true cost of using some of the web’s most popular free services. And that cost is the abundance of data that has been collected about users by Google, Apple, Facebook, Amazon Alexa, Twitter, and Snapchat. The good news is that you can take back your data “gold”.

F-Secure Data Discovery Portal sends users directly to the often hard-to-locate resources provided by each of these tech giants that allow users to review their data, securely and privately.

“What you do with the data collection is entirely between you and the service,” says Erka Koivunen, F-Secure Chief Information Security Officer. “We don’t see – and don’t want to see – your settings or your data. Our only goal is to help you find out how much of your information is out there.”

More than half of adult Facebook users, 54%, adjusted how they use the site in the wake of the scandal that revealed Cambridge Analytica had collected data without users’ permission.* But the biggest social network in the world continues to grow, reporting 2.3 billion monthly users at the end of 2018.**

“You often hear, ‘if you’re not paying, you’re the product.’ But your data is an asset to any company, whether you’re paying for a product or not,” says Koivunen. “Data enables tech companies to sell billions in ads and products, building some of the biggest businesses in the history of money.”

F-Secure is offering the tool as part of the company’s growing focus on identity protection that secures consumers before, during, and after data breaches. By spreading awareness of the potential costs of these “free” services, the Data Discovery Portal aims to make users aware that securing their data and identity is more important than ever.

A recent F-Secure survey found that 54% of internet users over 25 worry about someone hacking into their social media accounts.*** Data is only as secure as the networks of the companies that collect it, and the passwords and tactics used to protect our accounts. While the settings these sites offer are useful, they cannot eliminate the collection of data.

Koivunen says: “While consumers effectively volunteer this information, they should know the privacy and security implications of building accounts that hold more potential insight about our identities than we could possibly share with our family. All of that information could be available to a hacker through a breach or an account takeover.”

However, there is no silver bullet for users when it comes to permanently locking down their data or hiding it from the services they choose to use.

“Default privacy settings are typically quite loose, whether you’re using a social network, apps, browsers or any service,” says Koivunen. “Review your settings now, if you haven’t already, and periodically afterwards. And no matter what you can do, nothing stops these companies from knowing what you’re doing when you’re logged into their services.”

***Source: F-Secure Identity Protection Consumer (B2C) Survey, May 2019, conducted in cooperation with survey partner Toluna, 9 countries (USA, UK, Germany, Switzerland, The Netherlands, Brazil, Finland, Sweden, and Japan), 400 respondents per country = 3600 respondents (25+ years)



WhatsApp comes to KaiOS



By the end of September, WhatsApp will be pre-installed on all phones running the KaiOS operating system, which turns feature phones into smart phones. The announcement was made yesterday by KaiOS Technologies, maker of the KaiOS mobile operating system for smart feature phones, and Facebook. WhatsApp is also available for download in the KaiStore, on both 512MB and 256MB RAM devices.

“KaiOS has been a critical partner in helping us bring private messaging to smart feature phones around the world,” said Matt Idema, COO of WhatsApp. “Providing WhatsApp on KaiOS helps bridge the digital gap to connect friends and family in a simple, reliable and secure way.”

WhatsApp is a messaging tool used by more than 1.5 billion people worldwide who need a simple, reliable and secure way to communicate with friends and family. Users can use calling and messaging capabilities with end-to-end encryption that keeps correspondence private and secure. 

WhatsApp was first launched on the KaiOS-powered JioPhone in India in September of 2018. Now, with the broad release, the app is expected to reach millions of new users across Africa, Europe, North America, Southeast Asia, and Latin America.

“We’re thrilled to bring WhatsApp to the KaiOS platform and extend such an important means of communication to a brand new demographic,” said Sebastien Codeville, CEO of KaiOS Technologies. “We strive to make the internet and digital services accessible for everyone and offering WhatsApp on affordable smart feature phones is a giant leap towards this goal. We can’t wait to see the next billion users connect in meaningful ways with their loved ones, communities, and others across the globe.”

KaiOS-powered smart feature phones are a new category of mobile devices that combine the affordability of a feature phone with the essential features of a smartphone. They meet a growing demand for affordable devices from people living across Africa – and other emerging markets – who are not currently online. 

WhatsApp is now available for download from KaiStore, an app store specifically designed for KaiOS-powered devices and home to the world’s most popular apps, including the Google Assistant, YouTube, Facebook, Google Maps and Twitter. Apps in the KaiStore are customised to minimise data usage and maximise user experience for smart feature phone users.

In Africa, the KaiOS-powered MTN Smart and Orange Sanza are currently available in 22 countries, offering 256MB RAM and 3G connectivity.

KaiOS currently powers more than 100 million devices shipped worldwide, in over 100 countries. The platform enables a new category of devices that require limited memory, while still offering a rich user experience.

* For more details, visit: Meet The Devices That Are Powered by KaiOS

* Also read Arthur Goldstuck’s story, Smart feature phones spell KaiOS



Copyright © 2019 World Wide Worx