

Data science meets DevOps:
7 rules of machine learning



By: Yotam Yarden, senior data scientist at Amazon Web Services (AWS)

Machine Learning capabilities hold great potential for new revenue streams and tremendous cost savings for enterprises. Increasingly, businesses are using ML to strengthen their competitive advantage and drive innovation. Is your organization embracing this shift or are you falling behind? If you are on the “bias-for-action” side of the scale and have already started steering your organization towards digital & ML transformation, are you confident you are doing so in the right way?

Over the past decade, data has become increasingly important and has even been described as the “new oil”. Organizations with extensive user data can use it to increase sales and customer retention. Manufacturers can use machine data to improve the utilization of their equipment. Computed tomography images can be used to identify cancerous tumors. There is hardly an industry segment that cannot use data to improve existing business models and create new ones. Meanwhile, data has never been easier and less expensive to collect, store, analyze, and share. Many enterprises are building their data lakes today precisely for this reason. But is your organization taking full advantage of its data? Are you satisfied with the value you generate from your data? Do you struggle with building smart applications on top of your data lake? Big Data but not enough insights? Too much talking, not enough walking?

If so, consider the following tips:

  1. Be business driven and customer focused: What are your organization’s biggest challenges? Start from a focused business challenge and work backward towards a solution. Too many companies try to apply “self-driving cars” or “genome-sequencing” algorithms to a sales funnel optimization challenge just because they hired an expert in this field, while often there are models that better fit the task and bring higher value at lower costs. Don’t keep your data science team in the IT department alone. Rather, giving ownership of the data science team to a business stakeholder can invigorate your organization, and unlock new revenue streams and tremendous cost savings.
  2. Iterate fast and simple: Be quick and decisive about bringing your ML system into production. Conducting small iterations through tests, proofs of concept and pilots will help your team bring ML workloads into production faster and at higher quality. Plan to have a production-ready prototype in 3 weeks, and a fully operational version in under 90 days. Even if your system is not using the state-of-the-art model, you will learn far more by iterating quickly than you would from an overly long development cycle. ML transformations happen by building knowledge and experience through small, fast, and simple steps, rather than through multi-year planning. A redesign is inevitable. Only through experimentation, experience, and adaptation can you realize the full potential of your ML product. Fail fast and improve often.
  3. Centralize or Decentralize ML teams? Centralize ML teams when necessary, but aim to decentralize when possible. ML applications, like any other piece of software, require maintenance, updates, and support. A centralized team may be effective at low scale, but once you start expanding, innovation might suffer. Imagine a large innovation team that is working on multiple innovative projects. It is inevitable that at some point a substantial portion of the team’s work will be operating ongoing projects. That might be a good time to distribute the team to its real home, within the business unit that it serves. It can be hard to “give away” your “baby”, but it will help your ML team innovate on behalf of your customers.
  4. Consider the biggest roadblocks for data scientists & developers[1]: 1) dirty data, e.g. data sets which are unstructured, have missing attributes, and mixed data types in the same section; 2) lack of talent; 3) lack of management or financial support: ML projects require focus and funding, and organizations struggle to roll out such a project without management’s support; 4) lack of clear questions to answer: organizations are chasing improvement but lack the specifications and clear targets needed to achieve it; 5) data not available or difficult to access. If you plan appropriately, you will find that most of these roadblocks can be overcome. Lack of talent? Start hiring talent ahead of demand rather than have the data waiting for talent. Data not available? Start collecting data in advance of the project kick-off. Data not accessible? Don’t kick off a workshop without first obtaining relevant data samples. Lack of management or financial support? Get the buy-in in advance. Find the heroes among your stakeholders who are enthusiastic about AI and can support you with budget & headcount approvals, data accessibility, and connections to other business stakeholders.
  5. The separation between Data Science and DevOps is over! “Our PhDs develop ML models and write specifications for our developers to implement in C++.” If you can relate to this customer quote, start changing your team’s structure today. There is a wide range of tools that enable data scientists to take a step towards engineering, and vice-versa. The separation of “science” and “production” can prolong your company’s development & innovation cycles, thus leading to quality and ownership issues. Thankfully, technology is evolving at an increasing pace and new tools are continually released. It has never been easier for experts to expand their capabilities and cross over into new domains.
  6. Keep the right Data Scientists/Data Engineers ratio: What is the optimal Data Scientists/Data Engineers ratio? For most customers, the answer will depend on the maturity of the business. If your data is not accessible or you don’t maintain and track your data, you will likely need more engineering and less science. On the other hand, if you already have an established data pipeline, data warehouse, and data lake, you will likely want more science and less engineering. In some cases, your business will have specific requirements, which can affect the skills needed as well. As a rule of thumb, plan to have 2-3 engineers for every data scientist in the building phase, and 1:1 when a system is already deployed.
  7. Have clear KPIs (Key Performance Indicators) by which your project’s success can be measured. For example, imagine a Recommendation Engine project for an online media company. “Enhance user experience” might be a great goal, but without a way to measure success, this objective is overly ambiguous. Stakeholders might even disagree over whether the goal has been met, which can cause wasted resources and inefficient development. Can “enhancing the user experience” be measured by time spent on the platform? The number of videos watched? The number of new categories explored by the user? Each measure could lead to a different recommendation system.
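
To make rule 7 concrete, here is a minimal Python sketch of how the three candidate measures for the recommendation engine example could be computed side by side. Everything in it is an assumption for illustration: the `Session` record, its field names, and the `kpi_summary` helper are hypothetical and do not come from any particular platform or AWS service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """One viewing session. All field names are hypothetical examples."""
    user_id: str
    minutes_on_platform: float
    videos_watched: int
    categories_seen: frozenset


def kpi_summary(sessions, known_categories):
    """Compute three candidate KPIs per user from a list of sessions.

    known_categories maps user_id -> set of categories the user had
    already explored before the measurement window (assumed input).
    """
    totals = {}
    for s in sessions:
        row = totals.setdefault(
            s.user_id,
            {"minutes": 0.0, "videos": 0, "new_categories": set()},
        )
        row["minutes"] += s.minutes_on_platform
        row["videos"] += s.videos_watched
        # Only categories the user had not explored before count as "new".
        already_known = known_categories.get(s.user_id, set())
        row["new_categories"] |= s.categories_seen - already_known
    return {
        user: {
            "minutes": row["minutes"],
            "videos": row["videos"],
            "new_categories": len(row["new_categories"]),
        }
        for user, row in totals.items()
    }
```

Reporting the three numbers together makes the trade-off visible: a system tuned for minutes on the platform may do little for category exploration, so stakeholders should agree on which number defines success before the model is built.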

Having clear goals & KPIs will help you plan and execute more effectively:

ML initiatives are exciting and can be extremely fruitful. However, lack of focus, limited resources, and improperly set expectations can cause anxiety. Holding an “ML Discovery Workshop”, in which all stakeholders, both business and technical, brainstorm ideas, discuss their company’s biggest challenges, and plan, can help enormously. During the workshop, list all of your biggest challenges, their feasibility, estimated efforts, and missing skills and tools, and come up with a list of projects and a concrete execution plan. However, even the most well-intended execution plan will flounder without proper focus. With this in mind, remember: Be Customer Focused, Iterate Fast, Distribute data science when effective, Plan for roadblocks, Staff appropriately, and Choose specific KPIs that matter.

The writer is a senior data scientist at AWS and has been helping enterprises with their machine learning and cloud journey.


ConceptD: Creatives get a tech brand of their own

The unveiling of a new brand by Acer recognises the massive computing power needed in creative professions, writes ARTHUR GOLDSTUCK



It’s a crisp Spring morning in Brooklyn. The regular water taxi from Manhattan pulls up at Duggal Greenhouse on the edge of the East River. It’s a building that symbolises the rejuvenation of Brooklyn as a hub of artistic and creative expression.

Inside the vast structure, global computer brand Acer is about to unveil its own tribute to creativity. Company CEO Jason Chen takes to the stage in faded blue jeans and brown t-shirt, underlining the connection of the event to the informality of the area.

“Brooklyn is becoming more and more diverse,” he tells a gathering of press from around the world, attending the Next@Acer media event. “It’s an area that is up and coming. It represents new lifestyles. And our theme today is turning a new chapter for creativity.”

Every year, Next@Acer is a parade of the cutting edge in gaming and educational laptops and computers. New devices from sub-brands like Predator, Helios and Nitro have gamers salivating. This year is no different, but there is a surprise in store, hinted at in Chen’s introduction.

As a grand finale, he calls on stage Angelica Davila, whose day job is senior marketing manager for Acer Latin America. But she also happens to have a Master’s degree in computer and electrical engineering. A stint at Intel, where she joined a sales and marketing programme for engineers, set her on a new path.


For the last few months, she has been helping write Acer’s next chapter. She has shepherded into being nothing less than a new brand: ConceptD.



Which voice assistant wins battle of translators?



Take the most famous phrase from the Godfather – “I’m going to make him an offer he can’t refuse” – or “The only thing we have to fear is fear itself” from the inaugural address of US President Franklin Delano Roosevelt and see just how the virtual assistants do in translating them using their newly introduced Neural Machine Translation (NMT) capabilities. One Hour Translation (OHT), the world’s largest online translation service, conducted a study to find out just how accurate these new services are.

OHT used 60 sentences from movies and famous people ranging from the Godfather and Wizard of Oz to Neil Armstrong, the first man to set foot on the moon, US presidents Franklin Delano Roosevelt and John Fitzgerald Kennedy and historical figures like Leonardo da Vinci and Aesop. The sentences were translated by Google Assistant, Amazon’s Alexa and Apple’s Siri from English to French, Spanish, Chinese and German and then given to five professional translators for their assessment on a scale of 1-6. 

Google Assistant scored highest in three of the four language pairs surveyed (English to French, English to German and English to Spanish) and came second in English to Chinese. Amazon’s Alexa, whose translation engine is powered by Microsoft Translator, was tops in the English to Chinese category. Apple’s Siri placed second in English to French and English to Spanish, and third in English to German and English to Chinese. (See chart). All three virtual assistants are compatible with mobile phones.

“The automated assistants’ translation quality was relatively high, which means that assistants are useful for handling simple translations automatically,” says Yaron Kaufman, chief marketing officer and co-founder of OHT. He predicts that “there is no doubt that the use of assistants is growing rapidly, is becoming a part of our lives and will make a huge contribution to the business world.” 

A lot will depend on further improvements in NMT technology, which has revolutionized the field of translation over the past two years.  All the companies active in the field are investing large sums as part of this effort. “OHT is working with several of the leading NMT providers to improve their engines through the use of its hybrid online translation service that combines NMT and human post-editing,” notes Kaufman. He adds that this will no doubt have a huge impact on the use of assistants for translation purposes.

OHT has made a name for itself in assessing the level of translations by NMT engines.  Its ONEs Evaluation Score is a unique human-based assessment of the leading NMT engines conducted on a quarterly basis and used as an industry standard. 



Copyright © 2019 World Wide Worx