Monthly Archives: January 2022

Graviti Launches Data Platform Addressing the Needs of Working with Large Volumes of Unstructured Data

Graviti, a New York-based modern data infrastructure startup that has been in stealth mode for the past three years, is today launching its first product – Graviti Data Platform. Graviti Data Platform is designed to eliminate one of the costliest and most confounding problems faced by developers of artificial intelligence (AI) applications worldwide: working with large volumes of unstructured data.

Best Practices in Data Engineering: Brush Up Your Skills and Tidy Your Data with DIY Data

[SPONSORED POST] Trifacta introduces “DIY Data” – a unique webcast series that presents practical aspects of data engineering through hands-on demonstrations. The series is all about being hands-on with Trifacta through 30-minute, byte-sized, live and interactive episodes.

High-Performance Computing Essentials … And Why Analytics is Forever Grateful

In this contributed article, editorial consultant Jelani Harper discusses how the basics of High-Performance Computing (HPC) are fairly well known by those acquainted with even a modest amount of computer science. Instead of relying on a single machine for computations, HPC approaches involve multiple machines, which produce two notable effects that directly benefit analytics.

A Perspective – Data Privacy Day 2022

In this contributed article, Indu Peddibhotla, Senior Director, Products and Strategy, at Commvault’s SaaS venture, Metallic, discusses how IT professionals have become key innovators tasked with ensuring that organizations and businesses are not only protecting their data but also complying with privacy regulations. It’s through implementing intelligent data management solutions that IT professionals can be most successful.

How Better Data Management Can Help SMBs Predict the Future

In this special guest feature, Didi Gurfinkel, Co-Founder and CEO of DataRails, discusses the potential benefits of data for SMB predictive insights. Equipped with predictive analytics and data consolidation platforms, SMBs can streamline their data management and unlock vital insights, making it easier for business leaders to develop contingency plans in abnormal times like these – and making it harder for events beyond a business’s control to upend everything.

insideBIGDATA Guide to Computer Aided Engineering – Part 2

[SPONSORED POST] The essential first step for manufacturers is to consider how much data the enterprise has at its disposal. Most manufacturers collect vast troves of process data but typically use it only for tracking purposes, not as a basis for improving operations. The challenge is for these players to invest in the systems and skillsets that will allow them to enhance their use of existing process statistics. This Guide, “insideBIGDATA Guide to Computer Aided Engineering,” sponsored by Dell Technologies, will walk through some of the ways to expand the scope of analytics to further increase business value.

How data science can solve Telco’s energy problem

The following are two major topics in the minds of CxOs of any Telco operator:

  • 5G: Accelerating 5G to give a better experience to the customer and the possibility of increased revenue growth/market share.
  • Energy management: Worrying about the carbon emissions and energy consumption — not only incurring costs for the latter but also the resulting additional costs for carbon offsets.

According to McKinsey [1], energy costs look set to increase further and may account for as much as 5-7% of operating expenditures.

With further advances in technology for sustainability and green energy, one can reduce energy consumption by using infrastructure that has been designed to require less energy. However, for already deployed infrastructure (especially older technologies like 3G and 4G), this may not be an ROI-viable approach.

For towers already installed and in use, the viable option is to find means to reduce the energy consumed by these towers. Luckily, these Telco systems have a built-in feature precisely for this challenge. Nokia calls it “Power Savings Mode,” and Ericsson calls it “Cell Sleep Mode” (CSM).

Conceptually, both are the same. The idea is rooted in the fact that, for the most part, the power consumption of a cell layer stays the same regardless of its utilization. These power-saving features make use of this fact and program the cell layers to sleep when the utilization is low, resulting in energy savings.

One downside to this approach is that when it is applied at the wrong time or in unfavourable conditions, it may have an impact on the customer experience. For instance, customers streaming HD videos might experience a slow response or buffering, which may then affect their viewing experience. One way to counteract this is to set the thresholds very low so that the customer experience is less likely to be affected, but the downside is the lost opportunity in power savings. On the other hand, setting the thresholds higher means more cell layers will be put to sleep for a longer period of time, potentially impacting the customer experience.

The ideal solution is to enable the sleep mode for these layers when the utilization is low and the impact on the customer experience won’t be noticeable, and this requires knowing the utilization and other conditions of these towers in advance. In other words, the challenge is to determine the utilization of each cell layer for the next few weeks at a very granular interval and, most importantly, to determine the impact on customer experience for the same period at similar granularity.

Applying this strategy to a customer project

The Data Science and AI Elite team had an opportunity to work with a large telecom operator who was keen on applying machine learning to reduce cell tower energy consumption with minimal impact on their customers’ experience. Their current approach of manually setting thresholds was not scaling given the dynamic change in lifestyle and work behavior of their customers (especially as more people were working from home over the last two years).

For instance, cell towers in the Central Business District areas were underutilized because people were working from home, while network utilization in residential areas remained relatively high late into the night as behavioral patterns changed. These changes meant that existing power optimization thresholds were not effective and had to be updated.

Our solution consisted of two key components:

  • A network traffic forecast model
  • An optimization model

One key challenge was quantifying customer impact and establishing a common scale between cost savings (in dollar value) and impact on customer experience.

Figure 1: End-to-end pipeline for forecasting and optimization.

Other challenges we faced included the following:

  • Extreme difficulty in monitoring customer impact because many factors are involved, including the device used, the number of aggregated carriers, the surrounding devices connected to the cell tower, and how the cell layers are configured.
  • A highly subjective and debatable correlation of dollar value to customer experience.

The solution to the challenge will be unique to each telecom operator and will largely depend on the type of monitoring tools they have and their acceptance of the quantification of customer impact.

The first component is the forecasting model, which forecasts the network utilization for each cell layer at an hourly interval. This forecast provides the optimization model with an idea of what the traffic looks like so optimal decisions can be made.

The second component is the optimization model, which determines which cell layers should be put to sleep and at which hours. The objective function is to minimize the number of operating cell hours subject to various constraints. These could be business constraints (e.g., having at least two coverage cell layers always switched on) or technical constraints (e.g., the average sector load cannot exceed 60%).

A differentiating factor in this solution is the consideration of customer impact when cell layers are switched off. The key idea is to find the optimal balance between cost savings and customer impact. Each telecom operator will have their own preferred way of approximating customer impact, and this is usually the most challenging part.

In our case, we used throughput volume from non-carrier aggregated connected devices. This allowed us to approximate the impact when a cell layer is switched off, as such a device is connected only to that cell layer. This is an extremely important piece of information because it gives us an idea of the type of activity in which the user is engaged; for example, there is a difference between one device consuming 5 GB of data volume and five devices consuming 1 GB each.

RAN power savings using data science

Cell Sleep Mode overview

Cell Sleep Mode (CSM) has various parameters that govern the schedule of the cell. One set of key parameters is the sleep threshold and wakeup threshold. As their names suggest, the sleep threshold decides when the cell layer should be put to sleep, and the wakeup threshold decides when the cell layer should be woken up. When the cell sleep mode is enabled for a cell layer, all these parameters decide the state of that cell layer.
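The interplay of the two thresholds can be sketched as a small state machine. This is only an illustration, not vendor logic: the function name, the idea that a single monitored load value drives both transitions, and the sample numbers are all assumptions made for the example.

```python
def simulate_sleep_states(load, sleep_threshold, wakeup_threshold):
    """Return 'awake' or 'asleep' for each load sample.

    Assumption (not from any vendor documentation): the layer goes to sleep
    when the monitored load drops below sleep_threshold and wakes up when
    the load rises above wakeup_threshold.
    """
    if sleep_threshold > wakeup_threshold:
        raise ValueError("sleep_threshold must not exceed wakeup_threshold")
    states = []
    asleep = False
    for value in load:
        if asleep and value > wakeup_threshold:
            asleep = False  # load is high again: wake the layer
        elif not asleep and value < sleep_threshold:
            asleep = True   # load is low: put the layer to sleep
        states.append("asleep" if asleep else "awake")
    return states


# Hypothetical hourly load values in percent.
hourly_load = [35, 22, 9, 7, 14, 19, 31, 40]
states = simulate_sleep_states(hourly_load, sleep_threshold=10, wakeup_threshold=30)
print(states)
```

Keeping the wakeup threshold above the sleep threshold leaves a gap in which the layer holds its current state, which avoids rapid toggling when the load hovers around a single value.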


Telcos literally have tens of thousands of cell layers. Developing a forecasting model for each cell layer would simply result in too many models and is untenable. The clustering of cell layers based on utilization and the development of a forecasting model for each segment will reduce the number of forecasting models.

In addition, custom similarity metrics could also be applied to group more heterogeneous cell layers together. For example, instead of using the typical Euclidean distance, additional terms could be added to compare the direction of change so that cell layers with similar trends are grouped.
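One possible form of such a metric is sketched below. The function name, the way the direction of change is scored (the fraction of steps where the two series move in different directions) and the `trend_weight` parameter are assumptions chosen for illustration, not the metric used in the project.

```python
import math


def trend_aware_distance(a, b, trend_weight=1.0):
    """Euclidean distance plus a penalty for disagreeing direction of change.

    The penalty is the fraction of consecutive steps where the two series
    move in different directions (up, down or flat), scaled by trend_weight.
    """
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    euclidean = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def sign(value):
        return (value > 0) - (value < 0)

    steps = len(a) - 1
    mismatches = sum(
        1
        for i in range(steps)
        if sign(a[i + 1] - a[i]) != sign(b[i + 1] - b[i])
    )
    trend_penalty = mismatches / steps if steps else 0.0
    return euclidean + trend_weight * trend_penalty


# Hypothetical utilization profiles (percent).
reference = [10, 12, 10]
same_trend = [11, 13, 11]  # rises and falls with the reference
flat = [11, 11, 11]        # same Euclidean distance, but no trend

d_same = trend_aware_distance(reference, same_trend, trend_weight=10.0)
d_flat = trend_aware_distance(reference, flat, trend_weight=10.0)
print(d_same, d_flat)
```

Both candidate series are equally far from the reference in plain Euclidean terms, but the flat one scores a larger distance because it does not follow the same rise and fall. A function of this shape could be plugged into any clustering method that accepts a custom pairwise distance.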


The forecasting model must predict the PRB utilization for each cell layer for the next couple of weeks on an hourly basis. This forecasted PRB utilization of each cell layer is one of the key inputs to the Decision Optimization model. So, it is imperative to build a highly accurate forecasting model.

Time series forecasting is a standard technique. There are many methods to choose from, including traditional statistical types (e.g., ARIMA), machine learning types (e.g., XGBoost) and deep learning types (e.g., N-BEATS). Any of these will work; the main difference will most likely be the accuracy, and this will depend on the type of time series pattern the cell layer network utilization is exhibiting.

With the historical PRB utilization data segmented, we built an ensemble of forecasting models. Some of the key exogenous features that were included in the model were a list of holidays and COVID severity (count of active cases).

The forecasting model could be further enriched by including other features such as network change information, marketing campaign information, network outage/service information and major events.
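To make the role of an exogenous feature concrete, here is a deliberately naive baseline that is not the ensemble described above: it averages historical PRB utilization per hour of day and holiday flag, then reads the forecast from that profile. The function names, the record layout and the sample values are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean


def fit_hourly_profile(history):
    """Average utilization per (hour_of_day, is_holiday) bucket.

    history: iterable of (hour_of_day, is_holiday, prb_utilization).
    """
    buckets = defaultdict(list)
    for hour, is_holiday, utilization in history:
        buckets[(hour, is_holiday)].append(utilization)
    return {key: mean(values) for key, values in buckets.items()}


def forecast(profile, future_slots):
    """Look up the profile for each future (hour_of_day, is_holiday) slot.

    Falls back to the opposite holiday flag when a bucket was never seen,
    and to None when the hour is missing altogether.
    """
    predictions = []
    for hour, is_holiday in future_slots:
        if (hour, is_holiday) in profile:
            predictions.append(profile[(hour, is_holiday)])
        else:
            predictions.append(profile.get((hour, not is_holiday)))
    return predictions


# Hypothetical history for one cell-layer segment (utilization in percent).
history = [
    (9, False, 60.0),
    (9, False, 70.0),
    (9, True, 20.0),
    (3, False, 5.0),
]
profile = fit_hourly_profile(history)
predictions = forecast(profile, [(9, False), (9, True), (3, True)])
print(predictions)
```

A baseline like this is mainly useful as a reference point: a statistical, machine learning or deep learning model should beat it before it is trusted as an input to the optimization step.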

This fine-grained forecasted PRB utilization data is one of the key inputs to the Decision Optimization model. The forecasted PRB data can also be used as additional information to guide adjacent business functions, such as preventive maintenance scheduling, capacity upgrades to towers, energy invoice reconciliation and more.

Customer impact analysis

Approximating customer impact when a cell layer is switched off is the most challenging aspect of this project. First, different telecom providers have different definitions of customer impact. Second, it is extremely difficult to quantify and measure the impact due to many changing factors like time of day or number of devices connected.

After several rounds of discussion, we decided to use non-carrier aggregated volume as a proxy to quantify the impact of putting a cell layer to sleep. The key idea is that if earlier power-savings trials are running well and there are no complaints, we can infer the maximum non-carrier aggregated volume that is acceptable based on the current power-savings thresholds. At a high level, this can be accomplished in three steps:

  1. Calculate average sector load for each hour across a specific period.
  2. Select those time periods where sector load is below a specific threshold.
  3. Use the maximum volume for these time periods to derive the maximum acceptable impact loss.

The maximum acceptable impact loss in megabytes could then be a constraint for that specific cell layer and hour.
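The three steps above can be sketched as follows for a single cell layer. The function name, the record layout (hour of day, sector load in percent, non-carrier aggregated volume in megabytes) and the sample values are assumptions for illustration; the project's actual data model is not described in this article.

```python
from collections import defaultdict
from statistics import mean


def max_acceptable_volume(records, load_threshold):
    """Derive the maximum acceptable non-carrier aggregated volume (MB).

    records: iterable of (hour_of_day, sector_load_pct, non_ca_volume_mb)
    observed over the chosen period. Returns None if no hour qualifies.
    """
    loads = defaultdict(list)
    volumes = defaultdict(list)
    for hour, load, volume in records:
        loads[hour].append(load)
        volumes[hour].append(volume)

    # Step 1: average sector load for each hour across the period.
    average_load = {hour: mean(values) for hour, values in loads.items()}

    # Step 2: keep the hours whose average load is below the threshold.
    low_load_hours = [h for h, load in average_load.items() if load < load_threshold]
    if not low_load_hours:
        return None

    # Step 3: the largest volume seen in those hours is the acceptable maximum.
    return max(max(volumes[hour]) for hour in low_load_hours)


# Hypothetical observations for one cell layer.
records = [
    (2, 10.0, 120.0),
    (2, 14.0, 150.0),
    (3, 8.0, 90.0),
    (18, 70.0, 900.0),
    (18, 65.0, 850.0),
]
limit_mb = max_acceptable_volume(records, load_threshold=20.0)
print(limit_mb)
```

Here only hours 2 and 3 fall below the load threshold, so the busy evening hour does not influence the resulting limit.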

Decision Optimization model

The final step is to use the forecasted network traffic as an input, along with operational requirements and customer impact as constraints, to formulate an optimization model that generates a schedule for putting cell layers to sleep. The objective function of the optimization model is to minimize the number of cell layer operating hours subject to various constraints, such as the following:

  • Limit each cell layer to a maximum of 80% utilization.
  • Ensure that at least one base cell layer is always switched on.
  • Ensure total forecasted network traffic is met by cell layers.
  • Ensure non-carrier aggregated volume does not exceed the threshold.

In addition, the model needs to handle how the network traffic will be distributed when a cell layer is switched off. As there are many factors involved, such as the number of devices connected and the type of activities, we have used a conservative approach of assuming the entire load for the cell layer is replicated to the other cell layers. This ensures that the cell site will be able to handle the additional load when cell layers are put to sleep.

With these inputs to the model, the output will be a schedule for each cell layer at an hourly level that decides if it should be put to sleep mode or not.
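For a single cell site and a single hour, the decision can be illustrated with an exhaustive search over which layers stay active. This is a toy sketch, not the Decision Optimization model itself: the `Layer` fields, the function name, the default limits and the sample numbers are assumptions, and a real deployment would use a mathematical programming solver rather than brute force.

```python
from itertools import combinations
from typing import NamedTuple


class Layer(NamedTuple):
    name: str
    is_base: bool
    forecast_load: float     # forecasted traffic, same unit as capacity
    capacity: float
    non_ca_volume_mb: float  # forecasted non-carrier aggregated volume


def layers_to_sleep(layers, max_utilization=0.8, max_non_ca_mb=100.0):
    """Return the names of layers to sleep for one hour, or None if infeasible.

    Tries the smallest number of active layers first, so the first feasible
    choice minimizes operating layers for this hour.
    """
    for active_count in range(1, len(layers) + 1):
        for active in combinations(layers, active_count):
            active_names = {layer.name for layer in active}
            sleeping = [layer for layer in layers if layer.name not in active_names]
            # At least one base layer must stay switched on.
            if not any(layer.is_base for layer in active):
                continue
            # Sleeping layers must not strand too much non-CA volume.
            if any(layer.non_ca_volume_mb > max_non_ca_mb for layer in sleeping):
                continue
            # Conservative assumption: every active layer receives the
            # entire load of all sleeping layers.
            shifted = sum(layer.forecast_load for layer in sleeping)
            if all(
                (layer.forecast_load + shifted) / layer.capacity <= max_utilization
                for layer in active
            ):
                return [layer.name for layer in sleeping]
    return None


# Hypothetical site with one base layer and two capacity layers.
site = [
    Layer("base", True, 20.0, 100.0, 0.0),
    Layer("cap1", False, 15.0, 100.0, 30.0),
    Layer("cap2", False, 10.0, 100.0, 500.0),
]
schedule = layers_to_sleep(site)
print(schedule)
```

In this example `cap2` must stay on because its non-carrier aggregated volume exceeds the limit, so only `cap1` is put to sleep. Repeating the decision for every hour yields an hourly schedule, as long as no constraint links one hour to the next.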


Figure 2: Trial results.

A field trial of our models was conducted on a restricted number of cell sites to measure estimated cost savings. The results were promising: both the forecasting model and the Decision Optimization model performed well, yielding estimated cost savings of about 15-25% while the impact on customer experience remained low. Higher cost savings may be achievable at the price of a slightly greater impact on customer experience.


Our data science solution optimizes the tradeoff between the power/energy savings of the cell towers and the impact on customer experience. This capability gives businesses the right lever to balance cost reduction against customer impact. With the “new normal,” where the work-from-home percentage may be much higher, our solution will help Telco operators optimize their network operational planning, resulting in cost savings.

Learn more about the Data Science and AI Elite team.

The post How data science can solve Telco’s energy problem appeared first on Journey to AI Blog.

insideBIGDATA Latest News – 1/26/2022

In this regular column, we’ll bring you all the latest industry news centered around our main topics of focus: big data, data science, machine learning, AI, and deep learning. Our industry is constantly accelerating with new products and services being announced every day. Fortunately, we’re in close touch with vendors from this vast ecosystem, so we’re in a unique position to inform you about all that’s new and exciting. Our massive industry database is growing all the time, so stay tuned for the latest news items describing technology that may make you and your organization more competitive.

anch.AI, former AI Sustainability Center, Secures $2.1M in Seed Funding to Launch Ethical AI Governance Platform

Against the rising tide of regulation, anch.AI has released the first horizontally integrated ethical AI governance platform, a one-stop shop for businesses to accelerate responsible AI adoption across their organization. The B2B SaaS startup emerged from the AI Sustainability Center, a Swedish think tank, and has secured $2.1M in seed funding to further develop and launch their pioneering risk assessment platform.

The Shift to NLP-powered BI Will Unlock Its True Potential for Business Users

In this contributed article, Marcos Monteiro, CEO and co-founder of Veezoo, discusses how as AI and NLP technology evolves, the capabilities of natural-language interfaces to data become exponentially more powerful and there will be fewer use cases where traditional interfaces will remain superior.