I recently transitioned from Product Management in IPTV (TV / Video over the internet) Analytics to running Product Management at an AI company. Analytics, Data Science, and AI are very connected fields, but few people realize how much of the architectural knowledge from IPTV is uniquely relevant to applications of AI.
To me, it felt obvious that someone who understands the “nuts and bolts” of IPTV (video, Petabyte-scale data storage, and temporal analytics) has three of the primary pillars for success in the AI industry.
Video
The “TV” part of IPTV is fundamentally about video. Many of the most prominent applications of AI are also video- or camera-based: self-driving cars, object recognition, facial recognition, and everything else classified as computer vision require an understanding of video -- not only from the stance of “is this an image?” but also “why does this image look the way it does?” IPTV routinely wrestles with disparate frame cadences (50 interlaced fields per second of PAL content displayed on a 30 Hz Macbook screen, or a 24 Hz Android tablet, for example) and with fish-eye-corrected wide-angle shots, all in the service of making video look good for the user. Making video look good for the AI is just as important. GIGO (garbage in, garbage out) applies as much to AI as to any other decision-making process. If a self-driving car expects 30 fps video but receives 60 fps video, it will conclude it is driving half as fast as it actually is -- which can have catastrophic outcomes.
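The frame-rate failure above is just arithmetic, and a minimal sketch makes it concrete. This is not any real perception pipeline -- the function and numbers are illustrative assumptions -- but it shows how a speed estimate derived from per-frame motion collapses when the assumed frame rate is wrong:

```python
def estimated_speed_mps(displacement_m_per_frame, assumed_fps):
    """Speed inferred from observed per-frame motion, given an assumed frame rate."""
    return displacement_m_per_frame * assumed_fps

# Hypothetical numbers: the camera actually records at 60 fps and the car
# moves 0.5 m between consecutive frames, so the true speed is 30 m/s.
true_speed = estimated_speed_mps(0.5, 60)   # 30.0 m/s

# A pipeline that wrongly assumes 30 fps sees the same per-frame motion
# but infers half the true speed -- garbage in, garbage out.
wrong_speed = estimated_speed_mps(0.5, 30)  # 15.0 m/s
```

The per-frame displacement is identical in both calls; only the frame-rate assumption differs, and it alone halves the answer.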
Petabyte-Scale Data Storage
Many of the people doing AI feel like they are “doing Big Data.” They aren’t. I measure data at IPTV scale. A Terabyte is not “Big Data;” Petabytes are Big Data. Unfortunately, many AI tasks today are trained on a few Megabytes of data, which is why they fail to work at scale or ultimately become self-limiting. Let me be clear: if you are building an AI/ML product or service around a dataset measured in Megabytes rather than Terabytes, it is unlikely you are building anything of significant value. In my role at Recognant we keep a copy of all of the English Internet's text, updated roughly every two weeks. That makes us one of the few AI shops dealing with data at “webscale,” so it made a lot of sense to bring my background in massive-scale data to the company.
Because of the scale of data, managing the infrastructure for AI is as important as the AI itself. Working in IPTV, I had to have an understanding of large NAS, SAN, and Database deployments, which aren’t common in many other fields. Video is big, and storing all of that video, delivering it to the edge, and doing so reliably is analogous to one of the largest challenges in AI.
When working in AI, the first step is to do a “sift” -- removing as much data as possible to make the AI training as simple as possible. For example, if I’m building a clothing recommendation engine, I know that a customer's apartment number doesn’t impact which color dress they are likely to buy, so that information can be safely removed from the data before processing. As you sift data, you need to keep multiple copies and optimize for how that data is fed into the AI as a training set. Note: here I am using Machine Learning and AI interchangeably; however, at this stage it is the Machine Learning system that is determining patterns in the data, and it will be the AI that later acts on those patterns.
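The sift described above can be sketched in a few lines. The column names and records here are purely hypothetical, but the shape of the operation -- dropping fields known to have no predictive value before data ever reaches the training pipeline -- is the point:

```python
import pandas as pd

# Hypothetical customer records; all column names are illustrative only.
customers = pd.DataFrame({
    "customer_id":      [1, 2, 3],
    "apartment_number": ["4B", "12A", "7C"],   # no predictive value for dress color
    "favorite_color":   ["red", "green", "blue"],
    "dress_color":      ["red", "green", "blue"],  # the target we want to predict
})

# The "sift": remove fields we know cannot influence the target,
# shrinking every downstream copy of the training set.
sifted = customers.drop(columns=["apartment_number"])
```

In practice the sift is less about one `drop` call and more about doing this early, because every copy kept for training inherits whatever you failed to remove.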
The speed at which data can be fed into the ML system plays a huge role in how parallelized the processing can be. Spinning up 1,000 NVIDIA cores to process data is simple, but if you don’t have the throughput to keep them from being disk-bound, your processing cost can easily be 3-4x that of an optimized system. As compute is the most expensive part of ML, planning and optimization are imperative.
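A back-of-envelope check makes the disk-bound trap visible before any hardware is provisioned. The throughput figures below are assumptions for illustration, not benchmarks of any particular system:

```python
def accelerators_kept_busy(storage_throughput_gbs, per_accelerator_gbs):
    """How many accelerators a storage tier can feed before the rest sit disk-bound."""
    return int(storage_throughput_gbs // per_accelerator_gbs)

# Illustrative numbers: a storage tier sustaining 40 GB/s, feeding
# accelerators that each consume 0.5 GB/s of training data.
busy = accelerators_kept_busy(40, 0.5)       # 80 can be kept busy

# Spinning up 1,000 anyway means paying for 920 that mostly wait on disk.
idle = 1000 - busy                           # 920 idle
```

Running this arithmetic per training job is exactly the kind of planning the paragraph above argues for: the cheap fix is matching cluster size to I/O, not buying more compute.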
Temporal Analytics
IPTV is heavily driven by Temporal Analytics. Some additional analytics are based on Metadata, which is primarily text, but the majority is behavioral data, which is temporal. Additionally, network data (which is primarily numeric) is stored with timestamps and is most useful when viewed temporally. The ability to recognize the strengths and weaknesses of each type of data, and how to normalize it across systems, is a skill that many in data science lack. IPTV is a bit unique: unlike most web analytics, which are limited to a few pages, IPTV data is inherently made up of millions of frames. Each frame then represents a window of time of roughly 42 milliseconds (at 24 frames per second).
In fact, a typical season of a network TV show is almost exactly a million frames:
(16 episodes * 48 minutes per episode * 60 seconds per minute * 24 frames per second = 1,105,920 frames)
Each of those 1M+ frames is a reference point for data about user behavior. And while website analytics may track numerous elements, it is rare for a user to have options like stop, pause, rewind, fast-forward, volume up, volume down, and mute available at every one of the million frames a TV series contains. The collective user data in IPTV nearly always exceeds the size of the video files themselves in the depth and volume of what is recorded.
Building AI
AI works best when the trainer (Data Scientist, Product Manager, Developer) already knows the answer, or roughly knows the mechanics of the problem and an ideal result. When teaching someone to ride a bike, we don’t just say, “here kid, push the pedals and don’t fall over.” As adults, we know how balance works and that the child should not stop moving; therefore we instruct and course-correct accordingly. Similarly, AI/ML works best when it is given good direction to get started. Only then can it excel at filling in the gaps and scaling far beyond what humans could do. Take an IPTV example. Let’s say I’m building a movie recommendation engine (I have) -- that is, a system that automatically recommends the content that person X is most likely to enjoy. If my movie library were 200 titles and I personally sat around for hours and watched all of those movies while mentally noting themes, genres, actors, and directors, I’d probably have a pretty good idea of what someone who watched movie A would enjoy, based on common patterns and trends. But could I do that if I had 100k movies and needed to recommend to 1 million users?
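The scaling question above is exactly where the machine takes over the pattern-noting a human did for 200 titles. The source doesn't describe Recognant's actual recommender, so this is only a minimal co-viewing sketch with made-up watch histories -- but it is the kind of simple correlation that mechanizes "people who watched A also enjoyed B" across any library size:

```python
from collections import Counter
from itertools import combinations

# Hypothetical watch histories; titles are illustrative only.
histories = [
    {"Movie A", "Movie B", "Movie C"},
    {"Movie A", "Movie B"},
    {"Movie A", "Movie D"},
]

# Count how often each pair of titles was watched by the same user.
co_views = Counter()
for watched in histories:
    for pair in combinations(sorted(watched), 2):
        co_views[pair] += 1

def recommend(title, top_n=2):
    """Rank other titles by how often they co-occur with `title` in watch histories."""
    scores = Counter()
    for (a, b), count in co_views.items():
        if a == title:
            scores[b] += count
        elif b == title:
            scores[a] += count
    return [t for t, _ in scores.most_common(top_n)]
```

With 100k movies and a million users, the counting loop changes (it becomes a distributed join over the event logs) but the idea does not -- which is why the human's job shifts from watching movies to pointing the system at the right signals.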
Having been in the trend-spotting business, I can often look at data and find rough correlations. This allows the training I do with the AI platform to focus on the simple insights first; once the “low-hanging fruit” is identified, the micro-trends that create the deviations can be discovered. This concert of human and machine learning is what drew me to Recognant. Recognant doesn’t use Neural Network-based AI, but is instead a Transparent Machine Learning system that allows direct modification of the rubrics it uses to make decisions. Because of this, I can “point” the learning portion of the AI at a solution by getting it close with my own abilities and letting it optimize further from there.
I am excited to explore this new chapter in my career. I feel that everything I have done up to this point has been building toward working in AI. As the industry starts to embrace automated insights, converting my natural ability to create insights from data into training for synthetic trend spotting gives me, and the products I am working on, a distinct advantage.
About Freya Rajeshwar:
Freya is the Chief Product Officer at Recognant. She has almost a decade of experience in Product Management, DevOps, and Analytics. At Recognant, she works on a number of products, including ones to combat human trafficking. Outside of work, she partners with programs to promote Computer Science education for children.
#AI #ML #IPTV #Analytics #ProductManagement #BigData #DataScience #NeuralNetworks #Careers #WomenInTech #ArtificialIntelligence #MachineLearning #Video