Mobile Security Application | System Design | HLDs with use cases

DESIGN SCOPE

COMPONENTS DESCRIPTION

Authentication Module API

  • Collecting large volumes of data
  • Analysing the collected data
  • Firebase Cloud Messaging for Android users, and
  • Apple Push Notification service for iOS users.
  1. This API will GET all application logs, app settings, data usage logs, custom data features for the web crawler, file scan statistics for malware, etc. every x hours (say, every 3 hours).
  2. The Daily Feed generator will then create a daily log file by appending the hourly data.
  3. This daily file will be processed and analysed using a number of techniques and security features. A rank and a cumulative score will be assigned to each app, and if the rank/score is higher than the threshold limit, our system will alert the user about suspicious activity.
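The feed-and-score flow above can be sketched roughly as follows. This is only an illustration: the event fields (`app`, `risk`), the function names, and the threshold value are assumptions, since the article does not define how the score is computed.

```python
from collections import defaultdict

# Hypothetical threshold; the article does not give a concrete value.
SCORE_THRESHOLD = 0.7


def build_daily_feed(hourly_batches):
    """Append each collected batch (e.g. one every 3 hours) into one daily list."""
    daily = []
    for batch in hourly_batches:
        daily.extend(batch)
    return daily


def cumulative_scores(daily_feed):
    """Sum the per-event risk values for each app over the day."""
    totals = defaultdict(float)
    for event in daily_feed:
        totals[event["app"]] += event["risk"]
    return dict(totals)


def apps_to_alert(daily_feed, threshold=SCORE_THRESHOLD):
    """Return apps whose cumulative score exceeds the threshold, highest first."""
    scores = cumulative_scores(daily_feed)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [app for app, score in ranked if score > threshold]


# Two hypothetical collection batches for one day.
batches = [
    [{"app": "com.example.flashlight", "risk": 0.5},
     {"app": "com.example.notes", "risk": 0.1}],
    [{"app": "com.example.flashlight", "risk": 0.4}],
]
flagged = apps_to_alert(build_daily_feed(batches))
```

Here only the hypothetical flashlight app crosses the threshold, so it is the only one the user would be alerted about.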

File and Database Design

  1. Data storage is an essential part of this design, as it will handle all types of data, such as:
  • User information
  • User data usage
  • Application logs
  • Malicious objects
  • Quarantine data
  • Test database containing the pre-stored training data used for filtering phishing emails, processing spam SMS, insecure Wi-Fi detection rules, etc.
  • Database for logs and monitoring
  • Database for machine learning algorithms and data analytics
  • Database for filtering insecure web surfing, etc.
  1. The proposed solution for data storage is to use cloud technologies such as Amazon S3, together with Hadoop (HDFS), Hive, and Pig for big data processing.
  2. These services can provide database replicas, automatic sync, security and file encryption, and reliable tech support.
  3. The database should also use replication and sharding to increase throughput.
  4. Integrated databases should guarantee either strong or eventual consistency, depending on the requirements.
  5. Database indexes should be designed carefully to speed up read queries.
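As a rough illustration of the sharding point, the sketch below routes each record to a shard by hashing its key. The shard count, the `ShardedStore` class, and the use of in-memory dictionaries are all assumptions made for the example; a real deployment would rely on the sharding support of the chosen database.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a record key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """In-memory stand-in for a sharded database; each shard is a hash-indexed dict."""

    def __init__(self, num_shards=NUM_SHARDS):
        self.shards = [{} for _ in range(num_shards)]

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        # Only the one shard that owns the key is consulted.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore()
store.put("user:42", {"data_usage_mb": 120})
```

Because the hash is stable, a read for `user:42` always goes to the same shard, so reads and writes for different keys can be spread across machines.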
  1. This database will be used to store all user information, including the account profile, device information, and credentials, in encrypted form. The authentication module API will use this database for authorisation and authentication of the users.
  2. This DB will also be used to display data in the UI console for system administrators. The user information will be stored as key-value pairs so that lookups are fast, using hashing.
  3. For this purpose, etcd is a good choice for our system design. etcd is a highly available key-value store that can be used for persistent storage. Access to it is tightly controlled: it can be reached only through the API on the master node, and nodes in other clusters do not have access to the etcd store.
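A minimal sketch of the key-value user store might look like the following. A plain dictionary stands in for etcd, the `/users/<id>` key layout is hypothetical, and the sketch stores a salted PBKDF2 hash of the password rather than reversible ciphertext, which is an assumption about what "encrypted form" should mean for credentials.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # hypothetical work factor

user_store = {}  # plain dict standing in for the etcd key-value store


def _hash_password(password, salt):
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def register(user_id, password, device_info):
    """Store the profile under a hypothetical /users/<id> key."""
    salt = os.urandom(16)
    user_store[f"/users/{user_id}"] = {
        "salt": salt,
        "password_hash": _hash_password(password, salt),
        "device": device_info,
    }


def authenticate(user_id, password):
    """Look the user up by key and compare hashes in constant time."""
    record = user_store.get(f"/users/{user_id}")
    if record is None:
        return False
    candidate = _hash_password(password, record["salt"])
    return hmac.compare_digest(candidate, record["password_hash"])


register("alice", "s3cret", {"os": "Android"})
```

The authentication module would call something like `authenticate("alice", "s3cret")` on each login attempt; the raw password is never stored.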

UI CONSOLE DESIGN

SYSTEM-HIGH LEVEL DESIGNS

  • Malicious Apps
  • Insecure Wi-Fi
  • Insecure Web-Surfing
  • Phishing Emails
  • Phishing via other channels

Malicious Apps

Insecure Wi-Fi

  1. This product includes an encryption level checker module. When the device connects to a new Wi-Fi network, the module automatically checks the encryption level used by that network and notifies the user if it is insecure. Many public Wi-Fi networks still use WEP with open authentication, which is insecure. This type of encryption has many security flaws that can expose the user's personal information.
  2. It can use a firewall/VPN gateway to monitor inbound and outbound traffic, as required. On the outbound side, firewalls can be configured to prevent users from sending certain types of emails or transmitting sensitive data outside the network.
  3. A firewall is also useful for banking transactions, where no sensitive information should be leaked. It can either allow all traffic except data that meets a predetermined set of criteria, or prohibit all traffic unless it meets a predetermined set of criteria.
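The two checks above can be sketched as follows. The security-type labels, function names, and packet fields are assumptions made for illustration; a real app would read the security type from the platform's Wi-Fi API and enforce rules in an actual firewall or VPN gateway.

```python
# Hypothetical labels for network security types treated as insecure.
INSECURE_SECURITY_TYPES = {"OPEN", "WEP"}


def on_wifi_connected(ssid, security_type, known_ssids):
    """Return a warning string for a new, insecure network; otherwise None."""
    if ssid in known_ssids:
        return None  # only newly seen networks are checked
    known_ssids.add(ssid)
    if security_type.upper() in INSECURE_SECURITY_TYPES:
        return f"Warning: '{ssid}' uses {security_type.upper()} and is insecure."
    return None


def firewall_allows(packet, criteria, default_allow):
    """Apply one of the two policies described above.

    default_allow=True:  everything passes except traffic matching the criteria.
    default_allow=False: everything is blocked unless it matches the criteria.
    """
    matches = any(rule(packet) for rule in criteria)
    return not matches if default_allow else matches


known = set()
warning = on_wifi_connected("CafeFree", "wep", known)
smtp_only = [lambda p: p["port"] == 25]  # hypothetical criterion
```

With `default_allow=True`, `smtp_only` acts as a block list for outbound mail; with `default_allow=False`, the same criterion becomes an allow list.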

Insecure web-surfing

  1. This system will test a number of websites against a number of security features, rank and score them, and store the results in an etcd database. etcd is a strongly consistent, distributed key-value store that provides a reliable way to store data that needs to be accessed by a distributed system or cluster of machines. This DB is continuously updated by monitoring all existing and new sites against the latest security threats and vulnerabilities.
  2. This DB will be used to store all frequently visited, previously visited, popular, and potentially risky websites, along with their rank records.
  3. When a user requests a website, the request is sent as a POST request to the Secure_Surf API, which queries the etcd DB and returns a response stating whether the requested website is secure or not.
  4. If the requested website is insecure, our system will warn the user in real time about insecure surfing. The user is free to continue or abandon the surfing session, as needed.
  5. Since multiple users can request websites at the same time, this can put load on the API, which in turn can cause server downtime. This can be mitigated with a load balancer that distributes requests evenly across horizontally scaled API servers.
  6. This product will also use an additional firewall to allow or block content, control both incoming and outgoing traffic, restrict insecure webpages, alert the user when accessing HTTP content, and allow HTTPS content.
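The lookup performed by the Secure_Surf API could be sketched like this. A dictionary stands in for the etcd DB, and the score scale, the threshold, and the assumption that a higher score means a safer site are all hypothetical, since the article does not specify them.

```python
from urllib.parse import urlparse

SAFE_SCORE_THRESHOLD = 60  # hypothetical: scores below this count as insecure

# Plain dict standing in for the etcd store of site scores.
site_scores = {
    "example.com": 92,
    "bad.example.net": 15,
}


def secure_surf(url, scores=site_scores, threshold=SAFE_SCORE_THRESHOLD):
    """Return a verdict for a requested URL, with the reasons for any warning."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    warnings = []
    if parsed.scheme != "https":
        warnings.append("connection is not HTTPS")
    score = scores.get(host)
    if score is None:
        warnings.append("site has not been scored yet")
    elif score < threshold:
        warnings.append(f"site score {score} is below {threshold}")
    return {"host": host, "secure": not warnings, "warnings": warnings}


verdict = secure_surf("https://bad.example.net/login")
```

The client would show the `warnings` list to the user, who can then continue or abandon the session. Treating unscored sites as not yet secure is a design choice of this sketch, not something the article prescribes.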

Phishing emails

  1. Phishing is a type of online scam in which users receive emails that can be harmful and are used to steal personal information, spread malware and infections, and even cause financial losses.
  2. This product can help identify such scams by applying a number of detection techniques and flagging such emails to the user.
  3. The most challenging part of this design is continuously updating the DB that stores information about whether the sender of an email is a cyber criminal or not.
  1. The proposed email phishing design covers both important aspects of email filtering:
  • Email filtering based on the sender's info and domain
  • Email filtering based on the email's contents
  1. Whenever a new email event is triggered, the system will automatically check the sender's address and domain. If the sender is blacklisted in the system's DB, the email will be flagged as harmful.
  2. The product will maintain a DB containing all blacklisted and whitelisted sender information. This information will be stored as key-value pairs for fast lookup, and the DB will be continuously updated from multiple datasets and newly reported email scams.
  3. If the sender's email information isn't stored in the DB, the email will be further processed using machine learning techniques and classification algorithms.
  4. In the training module, the header of the email will be pre-processed and meaningless stop words will be removed. Then, each training email will be checked to capture the values of all necessary pre-defined attributes, such as spam keywords found in the sender's name/address/title, email size, and attachment types.
  5. In the proposed design, a “Keyword Database” will be built in advance, containing two types of information:
  • Spam Keyword Table: This table records suspicious keywords that are frequently found in spam emails.
  • Legitimate Keyword Table: This table records keywords that are commonly found in legitimate emails and seldom found in spam.
  1. The system will then maintain a Rule Engine DB that stores all the pre-defined rules and the decision tree values based on the attributes.
  2. This Rule Engine DB will be used to score the attributes of any new incoming email and decide whether it is spam or legitimate.
  3. Based on the evaluation, the email will be flagged as spam or legitimate, and the etcd DB will be updated with the result.
  4. Given the complexity of this design, Amazon SageMaker could be used to improve the throughput and reduce the latency of the filtering algorithm, and to serve real-time inferences.
  5. All the DBs can be stored in Amazon S3 with sufficient replica sets, so that database server downtime is avoided at any point in time.
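The two filtering stages described above can be sketched as follows. The sample addresses, keyword weights, stop words, and threshold are all hypothetical, and a simple weighted keyword score is swapped in here for the decision-tree rule engine to keep the example short.

```python
# Hypothetical sample data; the real lists and tables would live in the DB.
BLACKLIST = {"scam@phish.example"}
WHITELIST = {"billing@bank.example"}
SPAM_KEYWORDS = {"winner": 2.0, "urgent": 1.5, "lottery": 2.5}
LEGIT_KEYWORDS = {"invoice": 1.0, "meeting": 1.5}
STOP_WORDS = {"the", "a", "an", "is", "to", "of", "and", "you"}
SPAM_THRESHOLD = 2.0  # hypothetical rule-engine cut-off


def tokenize(text):
    """Lower-case the text, strip punctuation, and drop stop words."""
    words = (w.strip(".,!?:;\"'()").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


def keyword_score(subject, body):
    """Spam keywords raise the score; legitimate keywords lower it."""
    score = 0.0
    for word in tokenize(subject + " " + body):
        score += SPAM_KEYWORDS.get(word, 0.0)
        score -= LEGIT_KEYWORDS.get(word, 0.0)
    return score


def classify_email(sender, subject, body):
    sender = sender.lower()
    # Stage 1: fast key-value lookup on the sender.
    if sender in BLACKLIST:
        return "spam"
    if sender in WHITELIST:
        return "legitimate"
    # Stage 2: content-based scoring for unknown senders.
    verdict = "spam" if keyword_score(subject, body) >= SPAM_THRESHOLD else "legitimate"
    # Naive feedback step: record the result so the next lookup is fast.
    (BLACKLIST if verdict == "spam" else WHITELIST).add(sender)
    return verdict


result = classify_email("new@unknown.example", "You are a WINNER!",
                        "Urgent: claim the lottery prize")
```

Whitelisting a sender after a single legitimate email is deliberately simplistic; a production system would likely require more evidence before trusting a sender.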

Phishing via other channels

  1. Phishing is a cybercrime in which a target or targets are contacted by email, telephone or text message by someone posing as a legitimate institution to lure individuals into providing sensitive data such as personally identifiable information, banking and credit card details, and passwords.
  2. This design proposes a solution to phishing attacks carried out by cyber criminals mainly through SMS/instant messaging apps and phone calls.
  3. It can help users identify such attacks and flag them as suspicious using a number of detection techniques.
  4. It can also assess the legitimacy of a suspicious phone call in real time, and the user can post any suspicious message in the app to get a professional opinion.
  1. This design focuses on phishing via other channels, such as SMS/instant messaging apps and incoming calls.
  2. When a call or SMS event is triggered, the system will first check, through a simple query service, whether the number is already saved on the user's mobile.
  3. If the number is unsaved, the system will look it up, via hashing, in the DB of blacklisted numbers and flag it as suspicious if it is found.
  4. If the number isn't in the DB, the system will apply a pre-trained SMS filtering algorithm to determine whether this is a phishing/spam message. This algorithm includes:
  • Pre-processing of the message
  • Feature extraction using a TF-IDF vectoriser, and
  • A machine learning algorithm, such as Naive Bayes, to classify the messages as spam or not spam.
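The screening steps above can be sketched end to end as follows. To keep the example dependency-free, it uses a bag-of-words multinomial Naive Bayes on raw word counts instead of TF-IDF weights; the training messages, labels, and function names are hypothetical.

```python
import math
from collections import Counter

# Tiny hypothetical training set; a real model would use a labelled SMS corpus.
TRAINING = [
    ("win a free prize now", "spam"),
    ("claim your free cash reward", "spam"),
    ("urgent verify your bank account now", "spam"),
    ("are we still meeting for lunch", "ham"),
    ("see you at home tonight", "ham"),
    ("call me when you are free for lunch", "ham"),
]


def preprocess(message):
    """Lower-case the message and strip surrounding punctuation from each word."""
    words = (w.strip(".,!?:;").lower() for w in message.split())
    return [w for w in words if w]


class NaiveBayes:
    def __init__(self, examples):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()
        for text, label in examples:
            self.doc_counts[label] += 1
            self.word_counts[label].update(preprocess(text))
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])

    def log_score(self, words, label):
        total_docs = sum(self.doc_counts.values())
        total_words = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / total_docs)
        for word in words:
            # Laplace (add-one) smoothing so unseen words do not zero out a class.
            count = self.word_counts[label][word] + 1
            score += math.log(count / (total_words + len(self.vocab)))
        return score

    def predict(self, message):
        words = preprocess(message)
        return max(("spam", "ham"), key=lambda label: self.log_score(words, label))


def screen_sms(sender, message, contacts, blacklist, model):
    """Contacts first, then the blacklist (set lookups are hash-based), then the model."""
    if sender in contacts:
        return "trusted"
    if sender in blacklist:
        return "suspicious"
    return "suspicious" if model.predict(message) == "spam" else "ok"


model = NaiveBayes(TRAINING)
```

A call such as `screen_sms("+300", "free prize claim now", set(), set(), model)` reaches the classifier only after both lookups miss, which keeps the common cases cheap.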

Satyam Govila
