

Beginner’s Guide to Machine Learning and Deep Learning in 2023 – styxor.com
Published 4 months ago








Introduction
Learning is the acquisition and mastery of knowledge over a domain through experience. It is not only a human trait; it applies to machines too. With the advent of Artificial Intelligence, computing has moved from rigid mechanical systems to powerful automated ones. Data is the fuel that drives this technology, and the recent availability of enormous amounts of data has made AI the buzzword in technology. Artificial Intelligence, in its simplest form, simulates human intelligence in machines for better decision-making.
Artificial intelligence (AI) is a branch of computer science that deals with the simulation of human intelligence processes by machines. The term cognitive computing is also used to refer to AI, as computer models are deployed to simulate the human thinking process. Any device that perceives its current environment and acts to achieve its goal is said to be AI-enabled. AI can be broadly categorized as weak or strong. Systems that are designed and trained to perform a particular task, like voice-activated assistants, are known as weak AI. They can answer a question or obey a program command, but cannot work without human intervention. Strong AI is a generalized human cognitive ability: it can solve tasks and find solutions without human intervention. Self-driving cars, which use computer vision, image recognition and deep learning to pilot a vehicle, are often cited as a step in this direction, although today’s systems are still trained for one specific task. AI has made its entry into a variety of industries, including healthcare, education, finance, law and manufacturing, to the benefit of both businesses and consumers. Many technologies, such as automation, machine learning, machine vision, natural language processing and robotics, incorporate AI.
The drastic increase in the routine work carried out by humans calls for automation. Precision and accuracy are the next driving factors that demand intelligent systems in place of manual ones. Decision making and pattern recognition are compelling candidates for automation, because they require unbiased, decisive results that can be obtained through intense learning on the historic data of the domain concerned. This can be achieved through Machine Learning, where the system that makes predictions undergoes massive training on past data in order to make accurate predictions in the future. Popular applications of ML in daily life include commute time estimation, suggesting faster or optimal routes, and estimating the price per trip. In email, it powers spam filters, message classification and smart replies. In banking and personal finance it is used to make credit decisions and to prevent fraudulent transactions. It plays a major role in healthcare and diagnosis, social networking and personal assistants like Siri and Cortana. The list is almost endless and keeps growing every day as more and more fields employ AI and ML in their daily activities.
True artificial intelligence is decades away, but we have a type of AI called Machine Learning today. AI, also known as cognitive computing, branches into two related techniques: Machine Learning and Deep Learning. Machine learning occupies a considerable space in the research on building intelligent, automated machines. Such machines can recognize patterns in data without being programmed explicitly. Machine learning provides the tools and technologies to learn from the data and, more importantly, from the changes in the data. Machine learning algorithms have found their place in many applications: the apps that suggest the food you choose, the ones that decide on your next movie to watch, and the chatbots that book your salon appointments are a few of the applications that rock the information technology industry. Its counterpart, Deep Learning, is inspired by the cells of the human brain and is gaining popularity. Deep learning is a subset of machine learning that learns incrementally, moving from low-level categories to high-level categories. Deep Learning algorithms provide more accurate results when they are trained with very large amounts of data. They solve problems end to end, which has earned them the name magic box or black box, and their performance is best on high-end machines. Deep Learning is preferred in applications such as self-driving cars, pixel restoration and natural language processing. These applications simply blow our minds, but the full power of these technologies is yet to be revealed.
This article provides an overview of these technologies encapsulating the theory behind them along with their applications.
What is Machine Learning?
Computers can do only what they are programmed to do. That was the story until computers became able to perform operations and make decisions like human beings. Machine Learning, which is a subset of AI, is the technique that enables computers to mimic human beings. Arthur Samuel designed the first computer program that could learn as it executed in 1952, and he coined the term Machine Learning in 1959. Samuel was a pioneer in two sought-after fields, artificial intelligence and computer gaming. According to him, Machine Learning is the “Field of study that gives computers the capability to learn without being explicitly programmed”.
In ordinary terms, Machine Learning is a subset of Artificial Intelligence that allows software to learn by itself from past experience and to use that knowledge to improve its performance on future tasks without being programmed explicitly. Consider the task of identifying different flowers based on attributes like color, shape, smell and petal size. In traditional programming, the identification process is hardcoded as a set of rules to be followed. In machine learning, the same task can be accomplished by letting the machine learn the rules from the data provided to it. Data is the fuel that drives the learning process. Though the term Machine Learning was introduced way back in 1959, the fuel that drives this technology has become available only now. Machine learning requires huge amounts of data and computational power; these were once a dream and are now at our disposal.
Traditional programming Vs Machine Learning:
When computers are employed to perform tasks instead of human beings, they need to be given instructions in the form of a computer program. Traditional programming has been in practice for more than a century, with its origins in the mid-1800s. In this approach, a computer program takes the data and runs on a computer system to generate the output. For example, a traditionally programmed business analysis takes the business data and the rules (the computer program) as input and outputs business insights by applying the rules to the data.

In Machine Learning, by contrast, the data and the outputs (also called labels) are provided as input to an algorithm, which produces a model as its output.
For example, if customer demographics and transactions are fed in as input data and past customer churn outcomes are used as the output data (labels), an algorithm can construct a model that predicts whether a customer will churn or not. Such a model is called a predictive model. Machine learning models of this kind can be used to predict any situation for which the necessary historic data is available. Machine learning techniques are valuable because they allow computers to learn new rules in a high-dimensional, complex space that is hard for humans to comprehend.
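The contrast between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not a production churn model: the `history` data, the single `months_inactive` feature and the hand-written threshold of 12 are all hypothetical values chosen for the example.

```python
# Hypothetical toy data: (months_inactive, churned) pairs, invented for illustration.
history = [(0, False), (1, False), (2, False), (3, False),
           (5, True), (6, True), (8, True), (9, True)]


def rule_based_churn(months_inactive):
    """Traditional programming: a person writes the rule by hand."""
    return months_inactive >= 12  # threshold picked by the programmer


def learn_threshold(examples):
    """Machine learning: search for the threshold that best fits the labels."""
    best_threshold, best_correct = None, -1
    for candidate in sorted({x for x, _ in examples}):
        # Count how many labeled examples this candidate rule gets right.
        correct = sum((x >= candidate) == label for x, label in examples)
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# The "model" is simply the threshold produced by the learning step.
threshold = learn_threshold(history)


def learned_churn(months_inactive):
    """Predict churn with the rule that was learned from data."""
    return months_inactive >= threshold


print("learned threshold:", threshold)
print("hand-written rule says:", rule_based_churn(7))
print("learned model says:", learned_churn(7))
```

In the first function the rule is an input written by a person; in the second, data and labels are the inputs and the rule is the output.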
Need for Machine Learning:
Machine learning has been around for a while, but the ability to apply mathematical calculations automatically and quickly to huge amounts of data is only now gaining momentum. Machine Learning can be used to automate many tasks, especially ones that could previously be performed only by humans with their innate intelligence. This intelligence can be replicated in machines through machine learning.
Machine learning has found its place in applications like self-driving cars, online recommendation engines (such as friend recommendations on Facebook and product suggestions from Amazon) and the detection of cyber fraud. It is needed for problems like image and speech recognition, language translation and sales forecasting, where we cannot write down fixed rules for solving the problem.
Operations such as decision making, forecasting, making predictions, providing alerts on deviations, and uncovering hidden trends or relationships require large volumes of diverse, unstructured, real-time data from various sources, which is best handled by the machine learning paradigm.
History of Machine Learning
This section discusses the development of machine learning over the years. Today we are witnessing astounding applications like self-driving cars, natural language processing and facial recognition systems that rely on ML techniques. All this began in 1943, when the neurophysiologist Warren McCulloch and the mathematician Walter Pitts authored a paper that shed light on neurons and how they work. They modeled neurons with electrical circuits, and thus the neural network was born.
The famous “Turing Test” was proposed by Alan Turing in 1950 as a way to ascertain whether computers have real intelligence. To pass the test, a computer has to make a human believe that it is not a computer but a human. In 1952, Arthur Samuel developed the first computer program that could learn as it played the game of checkers. The first neural network, called the perceptron, was designed by Frank Rosenblatt in 1957.
The big shift happened in the 1990s, when machine learning moved from a knowledge-driven to a data-driven approach thanks to the availability of huge volumes of data. In 1997, IBM’s Deep Blue became the first machine to defeat the reigning world chess champion. Businesses have recognized that machine learning can increase their capacity for complex calculations. More recent projects include Google Brain, a deep neural network developed in 2012 that focused on pattern recognition in images and videos and was later employed to detect objects in YouTube videos. In 2014, Facebook created DeepFace, which can recognize people in photos much as humans do. In 2015, DeepMind’s AlphaGo program defeated a professional player at the board game Go. Because of its complexity, Go is regarded as a classic and very challenging game for artificial intelligence.
Scientists Stephen Hawking and Stuart Russell have warned that if AI gains the power to redesign itself at an accelerating rate, an unstoppable “intelligence explosion” could lead to human extinction. Elon Musk characterizes AI as humanity’s “biggest existential threat.” OpenAI is an organization co-founded by Musk in 2015 to develop safe and friendly AI that could benefit humanity. Recent breakthrough areas in AI include Computer Vision, Natural Language Processing and Reinforcement Learning.
Features of Machine Learning
In recent years Machine Learning has become an immensely popular topic in the technology domain. Almost every business is attempting to embrace it. Companies have transformed the way they carry out business, and the future looks bright and promising thanks to the impact of machine learning. Some of the key features of machine learning are:
Automation: The capacity to automate repetitive tasks, and hence increase business productivity, is the biggest strength of machine learning. ML-powered paperwork and email automation are used by many organizations. In the financial sector, ML makes accounting work faster and more accurate and draws useful insights quickly and easily. Email classification is a classic example of automation: Gmail automatically moves spam emails into the spam folder.
Improved customer engagement: Providing a customized experience and excellent service is very important for any business that wants to promote brand loyalty and retain long-standing customer relationships. Both can be achieved through ML, for example by creating recommendation engines tailored to the customer’s needs and chatbots that simulate human conversation smoothly, understand its nuances and answer questions appropriately. AVA, the virtual assistant of the airline AirAsia, is one such chatbot. It is powered by AI and responds to customer queries instantly. It can converse in 11 languages and makes use of natural language understanding.
Automated data visualization: Vast amounts of data are being generated by businesses, machines and individuals. Businesses generate data from transactions, e-commerce, medical records, financial systems and so on. Machines generate huge amounts of data from satellites, sensors, cameras, computer log files and IoT systems. Individuals generate data through social networks, emails, blogs and the Internet in general. The relationships in the data can be identified easily through visualization: patterns and trends are far easier to spot in a visual summary than in thousands of rows on a spreadsheet. Businesses can acquire valuable new insights, and so increase productivity in their domain, through user-friendly automated data visualization platforms provided by machine learning applications. AutoViz is one such platform that provides automated data visualization tools.
Accurate data analysis: The purpose of data analysis is to find answers to specific questions in support of business analytics and business intelligence. Traditional data analysis involves a lot of trial and error, which becomes impractical when working with large amounts of both structured and unstructured data. Data analysis is an important task that requires a great deal of time. Machine learning helps by offering many algorithms and data-driven models that can handle real-time data.
Business intelligence: Business intelligence refers to the streamlined operations of collecting, processing and analyzing data in an organization. Business intelligence applications powered by AI can scrutinize new data and recognize the patterns and trends that are relevant to the organization. When machine learning is combined with big data analytics, it can help businesses find solutions to the problems that stand in the way of growth and profit. ML has become one of the most powerful technologies for improving business operations, from e-commerce to the financial sector to healthcare.
Languages for Machine Learning
There are many programming languages available for machine learning. The choice of language and the level of programming required depend on how machine learning is used in an application. The fundamentals of programming, logic, data structures, algorithms and memory management are needed to implement machine learning techniques in business applications. With this knowledge one can implement machine learning models right away with the help of the built-in libraries offered by many programming languages. There are also graphical and scripting tools like Orange, BigML and Weka that allow ML algorithms to be applied without writing them by hand; all you need is a basic knowledge of programming.
No single programming language can be called the ‘best’ for machine learning. Each is good where it is applied. Some prefer Python for NLP applications, others prefer R or Python for sentiment analysis, and some use Java for ML applications relating to security and threat detection. Five languages that are well suited to ML programming are listed below.

Python:
Nearly 8.2 million developers around the world use Python. In the annual ranking by IEEE Spectrum, Python was chosen as the most popular programming language. Stack Overflow Trends also shows Python rising over the past five years. It has an extensive collection of packages and libraries for Machine Learning, and any user with a basic knowledge of Python programming can use these libraries right away without much difficulty.
For working with text data, packages like NLTK, scikit-learn and NumPy come in handy. OpenCV and scikit-image can be used to process images, and Librosa to work with audio data. For deep learning applications, TensorFlow, Keras and PyTorch are life savers. Scikit-learn can be used to implement classical machine learning algorithms and SciPy to perform scientific calculations. Packages like Matplotlib, scikit-learn and Seaborn are well suited to data visualization.
R:
R is an excellent programming language for machine learning applications that use statistical data. R is packed with a variety of tools to train and evaluate machine learning models and make accurate predictions. It is open source and therefore very cost effective. It is highly flexible and cross-platform compatible. It offers a broad spectrum of techniques for data sampling, data analysis, model evaluation and data visualization. Its comprehensive list of packages includes mice for handling missing values, caret for classification and regression problems, party and rpart for partitioning data, randomForest for building ensembles of decision trees, tidyr and dplyr for data manipulation, ggplot2 for creating data visualizations, and R Markdown and Shiny for communicating insights through reports.
Java and JavaScript:
Java is getting more attention in machine learning from engineers who come from a Java background. Many of the open source tools used for big data processing run on the Java Virtual Machine: Hadoop is written in Java, and Spark is written largely in Scala, which runs on the JVM. Java has a variety of third-party libraries, such as Java-ML, for implementing machine learning algorithms. Arbiter is a Java library used for hyperparameter tuning in ML. Others are Deeplearning4j and Neuroph, which are used in deep learning applications. The scalability of Java is a great lift to ML algorithms, enabling the creation of large and complex applications. The Java Virtual Machine is an added advantage, since the same code can run on multiple platforms.
Julia:
Julia is a general-purpose programming language that is capable of complex numerical analysis and computational science. It is designed with the mathematical and scientific operations behind machine learning algorithms in mind. Julia code executes at high speed and usually needs little additional optimization to address performance problems. It has a variety of tools such as TensorFlow.jl, MLBase.jl, Flux.jl and ScikitLearn.jl, and it supports many types of hardware, including TPUs and GPUs. Tech giants like Apple and Oracle are employing Julia for their machine learning applications.
Lisp:
LISP (List Processing) is the second oldest high-level programming language still in use. It was developed for AI-centric applications and is used in inductive logic programming and machine learning. ELIZA, one of the earliest AI chatbots, was originally written in MAD-SLIP and soon ported to LISP. Many machine learning applications, such as e-commerce chatbots, are developed using LISP. It provides quick prototyping capabilities, automatic garbage collection and dynamic object creation, and it offers a lot of flexibility in operations.
Types of Machine Learning
At a high level, machine learning is the study of teaching a computer program or an algorithm to automatically improve at a specific task. In a world drenched in artificial intelligence and machine learning, it is worth understanding the different types of machine learning. From the research point of view, they can be studied through theoretical and mathematical models of how the learning process works. From the perspective of a computer user, knowing the types helps in recognizing how they show up in various applications. And from the practitioner’s perspective, it is necessary to know them in order to build applications for a given task.

Supervised Learning:
Supervised learning is the class of problems in which a model learns the mapping between the input variables and the target variable. Applications whose training data describes both the input variables and the target variable are known as supervised learning tasks.
Let the set of input variables be (x) and the target variable be (y). A supervised learning algorithm tries to learn a hypothesis function f that maps inputs to targets, given by the expression y=f(x).
The learning process here is monitored, or supervised. Since the correct output is already known, the algorithm is corrected each time it makes a prediction, which optimizes the results. Models are fit on training data, which consists of both the input and the output variables, and are then used to make predictions on test data. Only the inputs are provided during the test phase; the outputs produced by the model are compared with the held-back target values to estimate the performance of the model.
There are basically two types of supervised problems: Classification, which involves predicting a class label, and Regression, which involves predicting a numerical value.
The MNIST handwritten digits data set is an example of a classification task. The inputs are images of handwritten digits, and the output is a class label that assigns each image to one of the digits 0 to 9.
The Boston house price data set is an example of a regression problem, where the inputs are the features of a house and the output is its price in dollars, which is a numerical value.
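As a minimal sketch of the supervised workflow described above, the following Python example fits y=f(x) as a straight line by least squares on training data and then scores it on held-back test data. The tiny data set (a single house-size feature and a price, in arbitrary units) is hypothetical and is not taken from the Boston data set.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned mapping f to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: inputs x (size) with their known targets y (price).
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.1, 4.9, 7.2, 8.8]

# Held-back test data: the model sees only test_x; test_y is used for scoring.
test_x = [5.0, 6.0]
test_y = [11.0, 13.1]

model = fit_line(train_x, train_y)
predictions = [predict(model, x) for x in test_x]

# Mean squared error between predictions and the held-back targets.
mse = sum((p - y) ** 2 for p, y in zip(predictions, test_y)) / len(test_y)

print("slope, intercept:", model)
print("test MSE:", mse)
```

A classification task follows the same fit-then-evaluate pattern, except that the model outputs a class label and is scored by accuracy rather than squared error.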
Unsupervised Learning:
In an unsupervised learning problem the model tries to learn by itself, recognizing patterns and extracting relationships in the data. Unlike in supervised learning, there is no supervisor or teacher to guide the model. Unsupervised learning operates only on the input variables; there are no target variables to guide the learning process. The goal is to interpret the underlying patterns in the data in order to understand the data better.
There are two main categories of unsupervised learning: Clustering, where the task is to find the different groups in the data, and Density Estimation, which tries to summarize the distribution of the data. Both are performed to understand the patterns in the data. Visualization and Projection may also be considered unsupervised, as they try to provide more insight into the data. Visualization involves creating plots and graphs of the data, and Projection is concerned with reducing the dimensionality of the data.
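Clustering can be illustrated with a short k-means sketch. The code below is a simplified version written for this article: the six 2-D points are hypothetical, and the centroids are initialised from the first k points to keep the run deterministic, whereas real implementations usually pick starting points at random.

```python
import math


def k_means(points, k, iterations=10):
    """Group points into k clusters; no labels are used at any stage."""
    centroids = list(points[:k])  # simplistic deterministic initialisation
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, labels


# Hypothetical unlabeled data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

centroids, labels = k_means(points, k=2)
print("cluster of each point:", labels)
print("centroids:", centroids)
```

The algorithm is given only the input points and discovers the two groups on its own, which is what distinguishes it from the supervised setting.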
Reinforcement Learning:
Reinforcement learning is a type of problem in which an agent operates in an environment and learns from the feedback, or reward, that the environment gives it. The rewards can be either positive or negative. The agent then proceeds in the environment based on the rewards gained.
The reinforcement agent determines for itself the steps needed to perform a particular task. There is no fixed training dataset here; the machine learns on its own.
Playing a game is a classic example of a reinforcement problem, where the agent’s goal is to achieve a high score. It makes successive moves in the game based on the feedback given by the environment, which may come as rewards or penalties. Reinforcement learning has shown tremendous results in Google’s AlphaGo, which defeated the world’s number one Go player.
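The agent, environment and reward loop can be sketched with tabular Q-learning, one common reinforcement learning method (it is named here as an illustration, not as the method used by AlphaGo). The environment is a hypothetical five-cell corridor invented for this example: the agent starts in cell 0 and is rewarded only when it reaches cell 4. For simplicity the agent explores uniformly at random during training.

```python
import random

N_STATES = 5                 # cells 0..4 in a row; cell 4 is the goal
ACTIONS = ("left", "right")


def step(state, action):
    """Environment: move one cell; reward 1.0 only on reaching the goal."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.9, max_steps=50, seed=0):
    """Learn action values Q(state, action) from experience alone."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)  # pure random exploration
            next_state, reward, done = step(state, action)
            # Value of the best action available from the next state.
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: nudge the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q


q_table = train()
# Greedy policy: in each non-goal cell, pick the action with the highest value.
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

No labeled examples are supplied at any point; the preference for moving right emerges purely from the rewards the environment returns.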
Machine Learning Algorithms
A wide variety of machine learning algorithms is available, and selecting the most appropriate one for the problem at hand can be difficult and time consuming. The algorithms can be grouped in two ways: first by their learning style, and second by the similarity of their function.
Based on their learning style they can be divided into three types:
- Supervised Learning Algorithms: The training data is provided along with labels that guide the training process. The model is trained until the desired level of accuracy is attained on the training data. Examples of such problems are classification and regression. Algorithms used include Logistic Regression, Nearest Neighbor, Naive Bayes, Decision Trees, Linear Regression, Support Vector Machines (SVM) and Neural Networks.
- Unsupervised Learning Algorithms: The input data is not labeled. The model is prepared by identifying the patterns present in the input data. Examples of such problems include clustering, dimensionality reduction and association rule learning. Algorithms used for these types of problems include K-Means and the Apriori algorithm for association rules.
- Semi-Supervised Learning Algorithms: Labeling data is expensive, as it requires the knowledge of skilled human experts. Here the input data is a combination of labeled and unlabeled data. The model makes predictions by learning the underlying patterns on its own. It is a mix of classification and clustering problems.
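The Nearest Neighbor algorithm named in the supervised group above is simple enough to sketch directly. The flower measurements below (petal length and width in centimetres) are hypothetical values chosen for illustration, and `knn_predict` is a name made up for this example.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled examples."""
    neighbours = sorted(
        training_set, key=lambda item: math.dist(item[0], query)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labeled data: ((petal_length, petal_width), species).
flowers = [
    ((1.4, 0.2), "setosa"),
    ((1.3, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((4.7, 1.4), "versicolor"),
    ((4.5, 1.5), "versicolor"),
    ((4.9, 1.5), "versicolor"),
]

print(knn_predict(flowers, (1.6, 0.25)))
print(knn_predict(flowers, (4.6, 1.3)))
```

The labels in `flowers` are what make this supervised: remove them and only the unsupervised techniques from the second group remain applicable.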
Based on the similarity of function the algorithms can be grouped into the following:
- Regression Algorithms: Regression is concerned with identifying the relationship between the target output variables and the input features in order to make predictions about new data. Widely used regression algorithms include Simple Linear Regression, Lasso Regression, Logistic Regression, Multivariate Regression and Multiple Regression.
- Instance-based Algorithms: These belong to the family of learning that compares new instances of the problem with those in the training data to find the best match and makes a prediction accordingly. The top instance-based algorithms are k-Nearest Neighbor, Learning Vector Quantization, Self-Organizing Map, Locally Weighted Learning and Support Vector Machines.
- Regularization: Regularization refers to the technique of constraining the learning process so that it does not rely too heavily on a particular set of features. The weights attached to the features are shrunk, which prevents certain features from dominating the prediction process. This technique helps to prevent overfitting in machine learning. Regularization algorithms include Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO) and Least-Angle Regression (LARS).
- Decision Tree Algorithms: These methods construct a tree-based model from decisions made by examining the values of the attributes. Decision trees are used for both classification and regression problems. Some of the well-known decision tree algorithms are Classification and Regression Tree, C4.5 and C5.0, Conditional Decision Trees, Chi-squared Automatic Interaction Detection and Decision Stump.
- Bayesian Algorithms: These algorithms apply Bayes’ theorem to classification and regression problems. They include Naive Bayes, Gaussian Naive Bayes, Multinomial Naive Bayes, Bayesian Belief Network, Bayesian Network and Averaged One-Dependence Estimators.
- Clustering Algorithms: Clustering algorithms group data points into clusters. Data points in the same group share similar properties, while data points in different groups have highly dissimilar properties. Clustering is an unsupervised learning approach and is mostly used for statistical data analysis in many fields. Algorithms like k-Means, k-Medians, Expectation Maximisation, Hierarchical Clustering and Density-Based Spatial Clustering of Applications with Noise fall under this category.
- Association Rule Learning Algorithms: Association rule learning is a rule-based learning method for identifying relationships between variables in a very large dataset. It is employed predominantly in market basket analysis. The most popular algorithms are the Apriori algorithm and the Eclat algorithm.
- Artificial Neural Network Algorithms: Artificial neural network algorithms are based on the biological neurons in the human brain. They belong to the class of complex pattern matching and prediction processes used in classification and regression problems. Some of the popular artificial neural network algorithms are Perceptron, Multilayer Perceptrons, Stochastic Gradient Descent, Back-Propagation, Hopfield Network and Radial Basis Function Network.
- Deep Learning Algorithms: These are modernized versions of artificial neural networks that can handle very large and complex databases of labeled data. Deep learning algorithms are tailored to handle text, image, audio and video data. Deep learning uses self-taught learning constructs with many hidden layers to handle big data, which calls for more powerful computational resources. Some of the popular deep learning algorithms include Convolutional Neural Networks, Recurrent Neural Networks, Deep Boltzmann Machines, Auto-Encoders, Deep Belief Networks and Long Short-Term Memory Networks.
- Dimensionality Reduction Algorithms: Dimensionality reduction algorithms exploit the intrinsic structure of data in an unsupervised manner to express the data using a reduced set of information. They convert high-dimensional data into a lower dimension that can then be used in supervised learning methods like classification and regression. Some of the well-known dimensionality reduction algorithms include Principal Component Analysis, Principal Component Regression, Linear Discriminant Analysis, Quadratic Discriminant Analysis, Mixture Discriminant Analysis, Flexible Discriminant Analysis and Sammon Mapping.
- Ensemble Algorithms: Ensemble methods are models made up of several weaker models that are trained separately; the individual predictions are then combined by some method to give the final overall prediction. The quality of the output depends on the method chosen to combine the individual results. Some of the popular methods are Random Forest, Boosting, Bootstrapped Aggregation, AdaBoost, Stacked Generalization, Gradient Boosting Machines, Gradient Boosted Regression Trees and Weighted Average.
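To make the Regularization entry in the list above concrete, here is a minimal sketch of Ridge Regression for the simplest possible case: one feature and no intercept, where the fitted weight has the closed form w = Σxy / (Σx² + λ). The data and the function name `ridge_weight` are hypothetical.

```python
def ridge_weight(xs, ys, penalty):
    """Closed-form ridge solution for y ≈ w * x (one feature, no intercept)."""
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    # The penalty term in the denominator shrinks the weight toward zero.
    return sum_xy / (sum_xx + penalty)


# Hypothetical data that follows y = 2x exactly.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

for penalty in (0.0, 1.0, 14.0, 100.0):
    print(penalty, ridge_weight(xs, ys, penalty))
```

With a penalty of 0 the result is the ordinary least-squares weight; as the penalty grows, the weight is pulled toward zero, which is the mechanism that keeps any single feature from dominating and helps limit overfitting.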
Machine Learning Life Cycle
Machine learning gives computers the ability to learn automatically without the need to program them explicitly. The machine learning process consists of several stages for designing, developing and deploying high-quality models. The Machine Learning Life Cycle comprises the following steps:
- Data collection
- Data Preparation
- Data Wrangling
- Data Analysis
- Model Training
- Model Testing
- Deployment of the Model

- Data Collection: This is the very first step in creating a machine learning model. The main purpose of this step is to identify and gather all the data that are relevant to the problem. Data could be collected from various sources like files, database, internet, IoT devices, and the list is ever growing. The efficiency of the output will depend directly on the quality of data gathered. So utmost care should be taken in gathering large volume of quality data.
- Data Preparation: The collected data are organized and put in one place for further processing. Data exploration is a part of this step, where the characteristics, nature, format and quality of the data are assessed. This includes creating pie charts, bar charts and histograms, checking skewness, etc. Data exploration provides useful insight into the data and is often said to solve about 75% of the problem.
- Data Wrangling: In data wrangling the raw data is cleaned and converted into a useful format. Common techniques applied to make the most of the collected data are:
- Missing value check and missing value imputation
- Removing unwanted data and Null values
- Optimizing the data based on the domain of interest
- Detecting and removing outliers
- Reducing the dimension of the data
- Balancing the data, Under-Sampling and Over-Sampling.
- Removal of duplicate records
- Data Analysis: This step is concerned with the feature selection and model selection process. The predictive power of the independent variables in relation to the dependent variable is estimated. Only those variables that are beneficial to the model are selected. Next, the appropriate machine learning technique (classification, regression, clustering, association, etc.) is selected and the model is built using the data.
- Model Training: Training is a very important step in machine learning, as the model tries to understand the various patterns, features and the rules from the underlying data. Data is split into training data and testing data. The model is trained on the training data until its performance reaches an acceptable level.
- Model Testing: After training the model it is put under testing to evaluate its performance on the unseen test data. The accuracy of prediction and the performance of the model can be measured using various measures like confusion matrix, precision and recall, Sensitivity and specificity, Area under the curve, F1 score, R square, gini values etc.
- Deployment: This is the final step in the machine learning life cycle, in which the constructed model is deployed in a real-world system. Before deployment the model is serialized (in Python this is commonly done by pickling it) so that it can be stored and loaded by the serving application. The serialized model can then be exposed through a REST API or microservices.
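The training, testing and deployment stages above can be sketched in a few lines. This is only a minimal illustration, assuming scikit-learn is installed; the bundled Iris dataset stands in for collected data, and the choice of logistic regression, the 80/20 split and the variable names are assumptions made for the example.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Data collection: a bundled toy dataset stands in for real collected data.
X, y = load_iris(return_X_y=True)

# Model training: hold out 20% of the rows as unseen test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Model testing: evaluate on the held-out rows only.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)

# Deployment preparation: serialize the fitted model and load it back.
payload = pickle.dumps(model)
restored = pickle.loads(payload)
same_predictions = bool((restored.predict(X_test) == y_pred).all())

print(f"accuracy={accuracy:.2f}")
print(matrix)
```

In a real service the pickled bytes would be written to a file and loaded by the process that answers prediction requests. A pickle should only be loaded from a trusted source, and ideally with the same library versions that produced it.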
Deep Learning
Deep learning is a subset of machine learning that follows the functionality of the neurons in the human brain. The deep learning network is made up of multiple neurons interconnected with each other in layers. The neural network has many deep layers that enable the learning process. It is made up of an input layer, an output layer and multiple hidden layers that together form the complete network. The processing happens through the connections, which carry the input data, the weights and the activation function that decides how signals flow through the network. The network operates on huge volumes of data and propagates them through each layer, learning complex features at each level. If the outcome of the model is not as expected, the weights are adjusted and the process repeats until the desired outcome is achieved.

A deep neural network can learn features automatically without being programmed explicitly. Each layer depicts a deeper level of information. The deep learning model follows a hierarchy of knowledge represented in each of the layers. A neural network with five layers can typically represent more complex features than a neural network with three layers. The learning in a neural network occurs in two steps. In the first step, a nonlinear transformation is applied to the input and a statistical model is created. During the second step, the created model is improved with the help of a mathematical method known as the derivative. These two steps are repeated by the neural network thousands of times until it reaches the desired level of accuracy. Each repetition of these two steps is known as an iteration.
A neural network that has only one hidden layer is known as a shallow network, and a neural network that has more than one hidden layer is known as a deep neural network.
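The two-step iteration can be illustrated on the smallest possible case. The sketch below is not a deep network: it assumes a single sigmoid neuron, a toy dataset (the logical AND of two inputs), a cross-entropy loss and a learning rate of 0.5, all chosen here only for illustration.

```python
import numpy as np

# Toy data: the logical AND of two binary inputs (an assumed example).
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 0.0, 0.0, 1.0])

weights = np.zeros(2)
bias = 0.0
learning_rate = 0.5


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def cross_entropy(pred, target):
    eps = 1e-12  # avoids log(0)
    return -np.mean(
        target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps)
    )


losses = []
for _ in range(2000):
    # Step 1: nonlinear transformation of the weighted input.
    pred = sigmoid(X @ weights + bias)
    losses.append(cross_entropy(pred, y))

    # Step 2: use the derivative of the loss to adjust weights and bias.
    error = pred - y
    weights -= learning_rate * (X.T @ error) / len(y)
    bias -= learning_rate * error.mean()

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

A deep network repeats the same idea with many layers, using backpropagation to obtain the derivative for every weight.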
Types of neural networks:
There are different types of neural networks available for different types of processes. The most commonly used types are discussed here.
- Perceptron: The perceptron is a single-layered neural network that contains only an input layer and an output layer. There are no hidden layers. The classic perceptron uses a step (threshold) activation function; a sigmoid function is often used instead when a smooth output is needed.
- Feed forward: The feed forward neural network is the simplest form of neural network where the information flows only in one direction. There are no cycles in the path of the neural network. Every node in a layer is connected to all the nodes in the next layer. So all the nodes are fully connected and there are no back loops.

- Recurrent Neural Networks: Recurrent Neural Networks save the output of the network in memory and feed it back to the network to help in the prediction of the output. The network is made up of two different layers. The first is a feed forward neural network and the second is a recurrent neural network where the previous network values and states are remembered in a memory. If a wrong prediction is made, the learning rate is used to gradually move towards the correct prediction through back propagation.
- Convolutional Neural Network: Convolutional Neural Networks are used where it is necessary to extract useful information from unstructured data. Propagation of signals is uni-directional in a CNN. The first layer is a convolutional layer, which is followed by a pooling layer and then by further convolutional and pooling layers. The output of these layers is fed into a fully connected layer and a softmax that performs the classification. The neurons in a CNN have learnable weights and biases. Convolution layers typically use the nonlinear ReLU activation function. CNNs are used in signal and image processing applications.

- Reinforcement Learning: In reinforcement learning, an agent that operates in a complex and uncertain environment learns by trial and error. The agent is virtually rewarded or punished as a result of its actions, which helps it refine the output produced. The goal is to maximize the total reward received by the agent. The model learns on its own to maximize the rewards. Google’s DeepMind and self-driving cars are examples of applications where reinforcement learning is leveraged.
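The trial-and-error loop described in the last item can be sketched without any neural network. The example below is a deliberately small stand-in: it assumes a hypothetical environment with three actions and noisy rewards, and an epsilon-greedy agent; the reward values, the exploration rate and the variable names are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical environment: three actions with different average rewards.
true_means = np.array([1.0, 2.0, 5.0])
estimates = np.zeros(3)          # the agent's running reward estimate per action
counts = np.zeros(3, dtype=int)  # how often each action has been tried
epsilon = 0.1                    # fraction of steps spent exploring at random

for _ in range(2000):
    if rng.random() < epsilon:
        action = int(rng.integers(3))        # explore: try a random action
    else:
        action = int(np.argmax(estimates))   # exploit: use the best known action
    reward = rng.normal(true_means[action], 0.1)  # noisy feedback from the environment
    counts[action] += 1
    # Incremental average: nudge the estimate towards the observed reward.
    estimates[action] += (reward - estimates[action]) / counts[action]

best_action = int(np.argmax(estimates))
print(best_action, np.round(estimates, 2))
```

After enough steps the agent spends most of its time on the action with the highest average reward, which is the behaviour the reward signal is meant to encourage.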
Difference Between Machine Learning And Deep Learning
Deep learning is a subset of machine learning. The machine learning models become better progressively as they learn their functions with some guidance. If the predictions are not correct then an expert has to make the adjustments to the model. In deep learning the model itself is capable of identifying whether the predictions are correct or not.
- Functioning: Deep learning takes the data as input and tries to make intelligent decisions automatically using stacked layers of artificial neural networks. Machine learning takes the input data, parses it and gets trained on the data. It tries to make decisions on the data based on what it has learnt during the training phase.
- Feature extraction: Deep learning extracts the relevant features from the input data automatically and in a hierarchical manner. The features are learnt layer by layer: the network learns low-level features initially, and as the data moves deeper into the network it learns more specific features. Machine learning models, in contrast, require features that are hand-picked from the dataset. These features are provided as the input to the model to do the prediction.
- Data dependency: Deep learning models require huge volumes of data as they do the feature extraction on their own, whereas a machine learning model can work well with smaller datasets. The depth of the network in a deep learning model increases with the data, and hence the complexity of the deep learning model also increases. The performance of a deep learning model keeps increasing as the data increases, whereas the performance of machine learning models flattens out beyond a certain amount of data.
- Computational Power: Deep learning networks are highly dependent on huge data which requires the support of GPUs rather than the normal CPUs. GPUs can maximize the processing of deep learning models as they can process multiple computations at the same time. The high memory bandwidth in GPUs makes them suitable for deep learning models. On the other hand machine learning models can be implemented on CPUs.
- Execution time: Normally deep learning algorithms take a long time to train due to the large number of parameters involved. The ResNet architecture, which is an example of a deep learning algorithm, takes almost two weeks to train from scratch. Machine learning algorithms take less time to train (a few minutes to a few hours). The situation is reversed at test time, where deep learning algorithms take less time to run.
- Interpretability: It is easier to interpret machine learning algorithms and understand what is being done at each step and why. Deep learning algorithms, on the other hand, are known as black boxes, as one does not really know what is happening inside the architecture: which neurons are activated and how much they contribute to the output. So interpreting machine learning models is much easier than interpreting deep learning models.

Applications of Machine Learning
- Traffic Assistants: All of us use traffic assistants when we travel. Google Maps comes in handy to give us routes to our destination and also shows us the routes with less traffic. Everyone who uses the maps provides their location, route taken and driving speed to Google Maps. These traffic details are collected by Google Maps, which tries to predict the traffic on your route and adjust your route accordingly.
- Social media: The most common applications of machine learning here are automatic friend tagging and friend suggestions. Facebook uses DeepFace to do image recognition and face detection in digital images.
- Product Recommendation: When you browse Amazon for a particular product but do not purchase it, you may see ads relating to it the next day when you open YouTube or Facebook. Your search history is being tracked by Google, which recommends products based on it. This is an application of machine learning.
- Personal Assistants: Personal assistants help in finding useful information. The input to a personal assistant can be either voice or text. Few people today have not heard of Siri and Alexa. Personal assistants can help in answering phone calls, scheduling meetings, taking notes, sending emails, etc.
- Sentiment Analysis: It is a real time machine learning application that can understand the opinion of people. Its application can be viewed in review based websites and in decision making applications.
- Language Translation: Translating languages is no longer a difficult task, as there are a handful of language translators available now. Google’s GNMT is an efficient neural machine translation tool that can access thousands of dictionaries and languages to provide an accurate translation of sentences or words using Natural Language Processing technology.
- Online Fraud Detection: ML algorithms can learn from historical fraud patterns and recognize fraudulent transactions in the future. ML algorithms have proved to be more efficient than humans in the speed of information processing. Fraud detection systems powered by ML can find frauds that humans fail to detect.
- Healthcare services: AI is becoming the future of healthcare industry. AI plays a key role in clinical decision making thereby enabling early detection of diseases and to customize treatments for patients. PathAI which uses machine learning is used by pathologists to diagnose diseases accurately. Quantitative Insights is AI enabled software that improves the speed and accuracy in the diagnosis of breast cancer. It provides better results for patients through improved diagnosis by radiologists.
Applications of Deep Learning
- Self-driving cars: Autonomous driving cars are enabled by deep learning technology. Research is also being done at the AI Labs to integrate features like food delivery into driverless cars. Data collected from sensors, cameras and geo-mapping helps to create more sophisticated models that can travel seamlessly through traffic.
- Fake news detection: Detecting fake news is very important in today’s world. The Internet has become the source of all kinds of news, both genuine and fake, and identifying fake news is a very difficult task. With the help of deep learning we can detect fake news and remove it from news feeds.
- Natural Language Processing: Trying to understand the syntaxes, semantics, tones or nuances of a language is a very hard and complex task for humans. Machines could be trained to identify the nuances of a language and to frame responses accordingly with the help of Natural Language Processing technique. Deep learning is gaining popularity in applications like classifying text, twitter analysis, language modeling, sentiment analysis etc, which employs natural language processing.
- Virtual Assistants: Virtual assistants use deep learning techniques to build extensive knowledge about their users, from dining-out preferences to favorite songs. Virtual assistants try to understand the languages spoken and carry out the requested tasks. Google has been working for many years on a technology called Google Duplex, which uses natural language understanding, deep learning and text-to-speech to help people book appointments at any time during the week. Once the assistant is done with the job, it gives you a confirmation notification that your appointment has been taken care of. Even when calls don’t go as expected, the assistant understands context and nuance and handles the conversation gracefully.
- Visual Recognition: Going through old photographs can be nostalgic, but searching for a particular photo can become a tedious process, as it involves sorting and segregation, which are time consuming. Deep learning can now be applied to images to sort them based on the locations in the photographs, combinations of people, or particular events or dates. Searching photographs is no longer tedious and complex. Vision AI draws insights from images in the cloud with AutoML Vision or pretrained Vision API models to identify text and understand emotions in images.
- Coloring of Black and White images: Coloring a black and white image is child’s play with the help of computer vision algorithms that use deep learning techniques to bring pictures to life by coloring them with the correct tones. The Colorful Image Colorization micro-service is an algorithm using computer vision and deep learning techniques that is trained on the ImageNet database to color black and white images.
- Adding Sounds to Silent Movies: AI can now create realistic sound tracks for silent videos. CNNs and recurrent neural networks are employed to perform feature extraction and prediction. Research has shown that algorithms that have learned to predict sound can produce better sound effects for old movies and help robots understand the objects in their surroundings.
- Image to Language Translation: This is another interesting application of deep learning. The Google translate app can automatically translate images into real time language of choice. The deep learning network reads the image and translates the text into the needed language.
- Pixel Restoration: Researchers at Google Brain have trained a deep learning network that takes a very low resolution image of a person’s face and predicts the person’s face from it. This method is known as Pixel Recursive Super Resolution. It enhances the resolution of photos by identifying the prominent features that are just enough to identify the person.
Conclusion
This chapter has explored the applications of machine learning and deep learning to give a clearer idea of the current and future capabilities of Artificial Intelligence. It is predicted that many applications of Artificial Intelligence will affect our lives in the near future. Predictive analytics and artificial intelligence are going to play a fundamental role in content creation and in software development; in fact, they are already making an impact. Within the next few years, AI development tools, libraries and languages are expected to become standard components of every software development toolkit. Artificial intelligence will shape the future of many domains, including health, business, environment, public safety and security.




Introduction
Label encoding is a technique used in machine learning and data analysis to convert categorical variables into numerical format. It is particularly useful when working with algorithms that require numerical input, as most machine learning models can only operate on numerical data. In this explanation, we’ll explore how label encoding works and how to implement it in Python.
Let’s consider a simple example with a dataset containing information about different types of fruits, where the “Fruit” column has categorical values such as “Apple,” “Orange,” and “Banana.” Label encoding assigns a unique numerical label to each distinct category, transforming the categorical data into numerical representation.
To perform label encoding in Python, we can use the scikit-learn library, which provides a range of preprocessing utilities, including the LabelEncoder class. Here’s a step-by-step guide:
- Import the necessary libraries:
from sklearn.preprocessing import LabelEncoder
- Create an instance of the LabelEncoder class:
label_encoder = LabelEncoder()
- Fit the label encoder to the categorical data:
label_encoder.fit(categorical_data)
Here, categorical_data refers to the column or array containing the categorical values you want to encode.
- Transform the categorical data into numerical labels:
encoded_data = label_encoder.transform(categorical_data)
The transform method takes the original categorical data and returns an array with the corresponding numerical labels.
- If needed, you can also reverse the encoding to obtain the original categorical values using the inverse_transform method:
original_data = label_encoder.inverse_transform(encoded_data)
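Put together, the steps above form a short runnable script. The fruit values below are illustrative stand-ins for the “Fruit” column mentioned earlier; everything else uses the LabelEncoder calls already shown.

```python
from sklearn.preprocessing import LabelEncoder

# Illustrative values for the "Fruit" column described above.
categorical_data = ["Apple", "Orange", "Banana", "Apple"]

label_encoder = LabelEncoder()
label_encoder.fit(categorical_data)

# Classes are stored in sorted order, so Apple=0, Banana=1, Orange=2.
encoded_data = label_encoder.transform(categorical_data)
original_data = label_encoder.inverse_transform(encoded_data)

print(list(label_encoder.classes_))  # learned categories
print(encoded_data.tolist())         # [0, 2, 1, 0]
print(original_data.tolist())        # back to the fruit names
```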
Label encoding can also be applied to multiple columns or features simultaneously. You can repeat steps 3-5 for each categorical column you want to encode.
It is important to note that label encoding introduces an arbitrary order to the categorical values, which may lead to incorrect assumptions by the model. To avoid this issue, you can consider using one-hot encoding or other methods such as ordinal encoding, which provide more appropriate representations for categorical data.
Label encoding is a simple and effective way to convert categorical variables into numerical form. By using the LabelEncoder class from scikit-learn, you can easily encode your categorical data and prepare it for further analysis or input into machine learning algorithms.
Now, let us first briefly understand what data types are and how they are scaled. It is important to know this before we proceed with categorical variable encoding. Data can be classified into three types, namely, structured, semi-structured and unstructured data.
Structured data is data represented in matrix form with rows and columns. It can be stored in a table in a SQL database, in a delimiter-separated CSV file, or in an Excel sheet with rows and columns.
Data that is not in matrix form can be classified into semi-structured data (data in XML or JSON format) or unstructured data (emails, images, log data, videos and textual data).
Let us say that, for a given data science or machine learning business problem, we are dealing only with structured data and the data collected is a combination of categorical and continuous variables. Most machine learning algorithms will not understand, or will not be able to deal with, categorical variables. In other words, machine learning algorithms perform better in terms of accuracy and other performance metrics when the data is presented to the model as numbers rather than categories for training and testing.
Deep learning techniques such as the Artificial Neural network expect data to be numerical. Thus, categorical data must be encoded to numbers before we can use it to fit and evaluate a model.
A few ML algorithms, such as tree-based methods (Decision Tree, Random Forest), do a better job of handling categorical variables. Even so, the best practice in any data science project is to transform categorical data into numeric values.
Now, our objective is clear. Before building any statistical models, machine learning, or deep learning models, we need to transform or encode categorical data to numeric values. Before we get there, we will understand different types of categorical data as below.
Nominal Scale
A nominal scale refers to variables that are names. They are used for labeling variables. Note that the categories on such a scale do not overlap with each other, and none of them has any numerical significance.
Below are the examples that are shown for nominal scale data. Once the data is collected, we should usually assign a numerical code to represent a nominal variable.

For example, for the categorical variable “In which place do you live?”, we can assign the numerical code 1 to represent Bangalore, 2 for Delhi, 3 for Mumbai and 4 for Chennai. It is important to note that the numerical values assigned do not have any mathematical meaning attached to them. Basic mathematical operations such as addition, subtraction, multiplication or division are therefore pointless: Bangalore + Delhi or Mumbai/Chennai does not make any sense.
Ordinal Scale
An Ordinal scale is a variable in which the value of the data is captured from an ordered set. For example, customer feedback survey data uses a Likert scale that is finite, as shown below.

In this case, let’s say the feedback data is collected using a five-point Likert scale. The numerical code 1, is assigned to Poor, 2 for Fair, 3 for Good, 4 for Very Good, and 5 for Excellent. We can observe that 5 is better than 4, and 5 is much better than 3. But if you look at excellent minus good, it is meaningless.
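A small sketch of this ordinal coding in pandas follows. The survey responses are hypothetical; the mapping reproduces the 1 to 5 codes described above. The order is supplied by hand because it cannot be inferred from the strings themselves.

```python
import pandas as pd

# Hypothetical survey responses on the five-point scale described above.
feedback = pd.Series(["Good", "Poor", "Excellent", "Fair", "Very Good", "Good"])

# The order is supplied explicitly; sorting the strings would not recover it.
likert_codes = {"Poor": 1, "Fair": 2, "Good": 3, "Very Good": 4, "Excellent": 5}
feedback_encoded = feedback.map(likert_codes)

print(feedback_encoded.tolist())  # [3, 1, 5, 2, 4, 3]
```

Comparisons such as `feedback_encoded > 3` are meaningful here, whereas differences between codes are not, as noted above.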
We very well know that most machine learning algorithms work exclusively with numeric data. That is why we need to encode categorical features into a representation compatible with the models. Hence, we will cover some popular encoding approaches:
- Label encoding
- One-hot encoding
- Ordinal Encoding
Label Encoding
In label encoding in Python, we replace the categorical value with a numeric value between 0 and the number of classes minus 1. If the categorical variable value contains 5 distinct classes, we use (0, 1, 2, 3, and 4).
To understand label encoding with an example, let us take COVID-19 cases in India across states. If we observe the below data frame, the State column contains a categorical value that is not very machine-friendly and the rest of the columns contain a numerical value. Let us perform Label encoding for State Column.
From the below image, after label encoding, the numeric value is assigned to each of the categorical values. You might be wondering why the numbering is not in sequence (Top-Down), and the answer is that the numbering is assigned in alphabetical order. Delhi is assigned 0 followed by Gujarat as 1 and so on.

Label Encoding using Python
- Before we proceed with label encoding in Python, let us import important data science libraries such as pandas and NumPy.
- Then, with the help of pandas, we read the Covid19_India data file, which is in CSV format, and check that the file is loaded properly with the help of info(). We can notice that the datatype of State is object. Now we can proceed with label encoding.
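The two steps above might look as follows. Since the Covid19_India file itself is not reproduced here, the sketch builds a small placeholder frame instead of reading the CSV; the “Confirmed” column, the case numbers, and the states Kerala and Tamil Nadu are made-up assumptions used only to have six categories to work with.

```python
import numpy as np
import pandas as pd

# With the real file one would call pd.read_csv(...) on its path (not shown here).
# The rows below are made-up placeholder values, not actual case counts.
covid19 = pd.DataFrame(
    {
        "State": ["Maharashtra", "Kerala", "Delhi", "Gujarat", "Tamil Nadu", "Uttar Pradesh"],
        "Confirmed": np.array([120, 95, 80, 60, 75, 50]),
    }
)

covid19.info()  # prints column names, non-null counts and dtypes

# "State" holds text (reported as object in most pandas versions), not numbers.
state_is_numeric = pd.api.types.is_numeric_dtype(covid19["State"])
print(state_is_numeric)
```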
Label Encoding can be performed in 2 ways namely:
- LabelEncoder class using scikit-learn library
- Category codes
Approach 1 – scikit-learn library approach
Label encoding is part of data preprocessing, so we use the preprocessing module from the sklearn package and import the LabelEncoder class.
And then:
- Create an instance of LabelEncoder() and store it in the labelencoder variable
- Apply fit and transform, which assigns a numerical value to each categorical value, and store the result in a new column called “State_N”
- Note that we have added a new column called “State_N”, which contains the numerical value associated with each categorical value, while the original State column is still present in the dataframe. The State column needs to be removed before we feed the final preprocessed data to a machine learning model
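These steps can be sketched as follows. The frame is a made-up placeholder standing in for the Covid19_India data (the “Confirmed” column and its values are assumptions); the names labelencoder and “State_N” follow the text.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Placeholder frame standing in for the Covid19_India data (values are made up).
covid19 = pd.DataFrame(
    {
        "State": ["Maharashtra", "Kerala", "Delhi", "Gujarat", "Tamil Nadu", "Uttar Pradesh"],
        "Confirmed": [120, 95, 80, 60, 75, 50],
    }
)

labelencoder = LabelEncoder()
# fit_transform learns the sorted categories and returns their integer codes.
covid19["State_N"] = labelencoder.fit_transform(covid19["State"])

# Drop the original text column before handing the frame to a model.
covid19_encoded = covid19.drop(columns="State")
print(covid19_encoded)
```

Because the categories are sorted alphabetically, Delhi receives 0 and Gujarat 1, matching the numbering described earlier.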
Approach 2 – Category Codes
- As observed earlier, the “State” column has the object datatype by default, so we first convert “State” to the category type with the help of pandas
- We can then access the codes of the categories by running covid19[“State”].cat.codes
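A minimal sketch of the category-codes approach, again on a made-up placeholder frame rather than the real file:

```python
import pandas as pd

# Placeholder frame standing in for the Covid19_India data (values are made up).
covid19 = pd.DataFrame(
    {
        "State": ["Maharashtra", "Kerala", "Delhi", "Gujarat", "Tamil Nadu", "Uttar Pradesh"],
        "Confirmed": [120, 95, 80, 60, 75, 50],
    }
)

# Convert the text column to the pandas category type.
covid19["State"] = covid19["State"].astype("category")

# Each category gets an integer code; categories are sorted alphabetically.
covid19["State_N"] = covid19["State"].cat.codes
print(covid19[["State", "State_N"]])
```

On this data the codes agree with what LabelEncoder produces, since both sort the category names.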
One potential issue with label encoding is that most of the time, there is no relationship of any kind between categories, while label encoding introduces a relationship.
In the above six classes’ example for “State” column, the relationship looks as follows: 0 < 1 < 2 < 3 < 4 < 5. It means that numeric values can be misjudged by algorithms as having some sort of order in them. This does not make much sense if the categories are, for example, States.
There is no such relation in the original data with the actual State names, but, by using numerical values as we did, a number-related connection between the encoded data might be made. To overcome this problem, we can use one-hot encoding as explained below.
One-Hot Encoding
In this approach, for each category of a feature, we create a new column (sometimes called a dummy variable) with binary encoding (0 or 1) to denote whether a particular row belongs to this category.
Let us consider the previous State column, and from the below image, we can notice that new columns are created starting from state name Maharashtra till Uttar Pradesh, and there are 6 new columns created. 1 is assigned to a particular row that belongs to this category, and 0 is assigned to the rest of the row that does not belong to this category.
A potential drawback of this method is a significant increase in the dimensionality of the dataset (which is called a Curse of Dimensionality).
This is because one-hot encoding creates an additional column for each unique value of the categorical attribute we’d like to encode. So, if we have a categorical attribute that contains, say, 1,000 unique values, one-hot encoding will generate 1,000 new attributes, which is not desirable.
To keep it simple, one-hot encoding is quite a powerful tool, but it is only applicable for categorical data that have a low number of unique values.
Creating dummy variables introduces a form of redundancy to the dataset. If a feature has three categories, we only need to have two dummy variables because, if an observation is neither of the two, it must be the third one. This is often referred to as the dummy-variable trap, and it is a best practice to always remove one dummy variable column (known as the reference) from such an encoding.
Data should not get into dummy variable traps that will lead to a problem known as multicollinearity. Multicollinearity occurs where there is a relationship between the independent variables, and it is a major threat to multiple linear regression and logistic regression problems.
To sum up, we should avoid label encoding in Python when it introduces false order to the data, which can, in turn, lead to incorrect conclusions. Tree-based methods (decision trees, Random Forest) can work with categorical data and label encoding. However, for algorithms such as linear regression, models that calculate distance metrics between features (k-means clustering, k-Nearest Neighbors) or Artificial Neural Networks (ANN), one-hot encoding is the better choice.
One-Hot Encoding using Python
Now, let’s see how to apply one-hot encoding in Python. Getting back to our example, in Python, this process can be implemented using 2 approaches as follows:
- scikit-learn library
- Using Pandas
Approach 1 – scikit-learn library approach
1. One-hot encoding is part of data preprocessing, so we use the preprocessing module from the sklearn package and import the OneHotEncoder class from it.
2. Instantiate the OneHotEncoder object. Note that the parameter drop='first' handles the dummy variable trap.
3. Perform one-hot encoding on the categorical variable.
4. Merge the one-hot encoded dummy variables into the original data frame, but do not forget to remove the original column called “State”.
5. In the resulting output, we can observe that the dummy variable trap has been taken care of.
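The five steps above can be sketched as follows. The “covid19” data frame here is a small stand-in with made-up case counts, since the original data is not shown, and the “Confirmed” column name is an assumption. The column names for the dummy variables are built from the encoder's categories_ attribute so that the sketch does not depend on a particular scikit-learn version.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder  # step 1: import the class

# Stand-in data: six states with made-up counts (not the original dataset)
covid19 = pd.DataFrame({
    "State": ["Maharashtra", "Tamil Nadu", "Delhi",
              "Karnataka", "Gujarat", "Uttar Pradesh"],
    "Confirmed": [284, 234, 120, 101, 87, 103],
})

# Step 2: drop='first' removes one dummy column to avoid the dummy variable trap
encoder = OneHotEncoder(drop="first")

# Step 3: encode the categorical column (the default result is a sparse matrix)
encoded = encoder.fit_transform(covid19[["State"]]).toarray()

# The first (alphabetically sorted) category is the one that was dropped
kept_categories = encoder.categories_[0][1:]
dummy_df = pd.DataFrame(
    encoded,
    columns=[f"State_{name}" for name in kept_categories],
    index=covid19.index,
)

# Step 4: merge the dummies and remove the original "State" column
result = pd.concat([covid19.drop(columns=["State"]), dummy_df], axis=1)

# Step 5: five dummy columns remain; Delhi is the reference category
print(result)
```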
Approach 2 – Using Pandas: with the help of get_dummies function
- One-hot encoding is such a common operation in analytics that pandas provides a function, get_dummies, to create the new features representing a categorical variable.
- We use the same DataFrame called “covid19”. Importing the pandas library is sufficient to perform one-hot encoding.
- Calling pd.get_dummies with drop_first=True generates a new DataFrame containing five indicator columns because, as explained earlier, modeling does not need one indicator variable for every category; a categorical feature with K categories needs only K-1 indicator variables. In our example, “State_Delhi” was removed.
- With 6 categories, we need only five indicator variables to preserve the information (and avoid collinearity). That is why the pd.get_dummies function has a Boolean argument, drop_first, which drops the first category when set to True.
- Since the pd.get_dummies function generates another DataFrame, we need to concatenate (or add) its columns to our original DataFrame.
- Here, we use the pd.concat function, indicating with the axis=1 argument that we want to concatenate the columns of the two DataFrames given in the list (which is the first argument of pd.concat). Don’t forget to remove the original “State” column afterwards.
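A minimal sketch of the pandas approach described above is shown below. As before, the “covid19” data frame is a stand-in with made-up numbers, and the “Confirmed” column is an assumption. The dtype=int argument is added so that the indicators print as 0 and 1 rather than True and False.

```python
import pandas as pd

# Stand-in data: six states with made-up counts (not the original dataset)
covid19 = pd.DataFrame({
    "State": ["Maharashtra", "Tamil Nadu", "Delhi",
              "Karnataka", "Gujarat", "Uttar Pradesh"],
    "Confirmed": [284, 234, 120, 101, 87, 103],
})

# K = 6 categories, so drop_first=True leaves K - 1 = 5 indicator columns
dummies = pd.get_dummies(covid19["State"], prefix="State",
                         drop_first=True, dtype=int)

# Concatenate column-wise (axis=1), then remove the original "State" column
encoded = pd.concat([covid19, dummies], axis=1).drop(columns=["State"])

print(encoded)
```

The row for Delhi contains 0 in all five indicator columns, which is how the dropped reference category is still represented.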
Ordinal Encoding
An Ordinal Encoder is used to encode categorical features as ordinal numerical values, that is, as numbers that follow the order of the categories.
This encoding technique looks similar to Label Encoding. The difference is that label encoding does not consider whether a variable is ordinal, whereas ordinal encoding assigns a sequence of numerical values that follows the order of the data.
Let’s create sample ordinal categorical data from a customer feedback survey, and then apply the Ordinal Encoder technique. In this case, let’s say the feedback data is collected using a Likert scale in which numerical code 1 is assigned to Poor, 2 to Good, 3 to Very Good, and 4 to Excellent. We know that 4 is better than 3, and that 4 is much better than 2, but taking the difference between 4 and 2 is meaningless (Excellent minus Good is meaningless).
Ordinal Encoding using Python
With the help of Pandas, we will assign customer survey data to a variable called “Customer_Rating” through a dictionary and then we can map each row for the variable as per the dictionary.
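The mapping described above can be sketched like this. The survey responses are made up for illustration, and the name of the encoded column, “Rating_Encoded”, is an assumption.

```python
import pandas as pd

# Made-up survey responses for illustration
feedback = pd.DataFrame({
    "Customer_Rating": ["Poor", "Good", "Very Good", "Excellent", "Good"],
})

# The dictionary fixes the order: Poor < Good < Very Good < Excellent
rating_order = {"Poor": 1, "Good": 2, "Very Good": 3, "Excellent": 4}

# Map each row to its ordinal code
feedback["Rating_Encoded"] = feedback["Customer_Rating"].map(rating_order)

print(feedback)
```

Any rating that is missing from the dictionary would be mapped to NaN, so it is worth checking for missing values after the mapping.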
That brings us to the end of the blog on Label Encoding in Python. We hope you enjoyed this blog.
Python Main Function and Examples with Code – styxor.com
Published May 31, 2023

In the vast landscape of programming languages, Python is a versatile and powerful tool that has gained immense popularity among developers of all levels. Created by Guido van Rossum and first released in 1991, Python has evolved into a robust and flexible language known for its simplicity, readability, and extensive library support.
Python’s popularity can be attributed to its ease of use and ability to handle various tasks. Whether you’re a beginner taking your first steps into programming or a seasoned developer working on complex projects, Python provides an accessible and efficient environment that empowers you to bring your ideas to life.
One of Python’s greatest strengths lies in its emphasis on code readability. The language was designed with a clean and straightforward syntax, making it easy to read and understand. This readability not only makes Python a great choice for beginners learning the basics of programming but also enhances collaboration among teams, as code written in Python is often more intuitive and expressive.
Furthermore, Python’s versatility enables it to be used in various domains and industries. From web development and data analysis to artificial intelligence and machine learning, Python has established itself as a go-to language for a wide range of applications. Its extensive library ecosystem, including popular ones like NumPy, pandas, and TensorFlow, provides developers with a rich set of tools and frameworks to tackle complex problems efficiently.
Python has earned a reputation as one of the most popular and in-demand programming languages to learn. To excel in Python, it is essential to understand each aspect of the language, and the Python main function is one such aspect.
This article takes a closer look at the main function in Python programming. Let’s start by understanding more about the term.
What is Python Main?
Many programming languages have a special function, known as the main function, that executes automatically whenever the program runs. In program syntax, it is written as “main()”.
In Python, a main function conventionally acts as the starting point of a program, but the language does not require one. The Python interpreter executes a file from the top, statement by statement, so a function named main() runs only if the code calls it. By convention, that call is guarded so that it happens only when the file is run directly, and not when the file is imported as a module.
Examples Of Python Main With Code
To understand the main function in Python in a better way, let’s see the below-mentioned example without using the main method:
Input:
print("How are you?")
def main():
    print("What about you?")
print("I am fine")
Output:
How are you?
I am fine
Explanation
Observing the above program closely, one can see that only ‘How are you?’ and ‘I am fine’ are printed, and ‘What about you?’ is not. The reason is that main() is defined but never called anywhere in the program.
Now let’s see the same program with a call to main() guarded by if __name__ == "__main__":
Input
print("How are you?")
def main():
    print("What about you?")
print("I am fine")
if __name__ == "__main__":
    main()
Output:
How are you?
I am fine
What about you?
Explanation
Looking at this program, one may ask why “What about you?” is now printed. It is printed because main() is called at the end of the code. The output therefore shows ‘How are you?’ first, ‘I am fine’ next, and ‘What about you?’ at the end.
What Does Python Main Do?
A main() function is defined by the user in the program, which means parameters can be passed to it as the program requires. Its purpose is to serve as the entry point that invokes the program’s code at run time.
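Because main() is an ordinary user-defined function, it can take parameters like any other function. The sketch below is one possible arrangement, not the only one: the argv parameter and the greeting logic are assumptions made for illustration.

```python
import sys


def main(argv):
    """Greet every name in argv; fall back to a default when none is given."""
    names = argv or ["world"]
    for name in names:
        print(f"Hello, {name}!")
    return len(names)  # number of greetings printed


if __name__ == "__main__":
    # sys.argv[0] is the script name, so pass only the remaining arguments
    main(sys.argv[1:])
```

Passing the arguments in explicitly, instead of reading sys.argv inside the function, also makes main() easy to call from other code, for example main(["Asha", "Ravi"]).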
What Is __name__ In Python?
The __name__ variable (two underscores before and after) is a special Python variable. The value it gets depends on how the containing script is executed: it is "__main__" when the script is run directly, and the module’s name when the script is imported. This matters because a script written with functions is often useful in other scripts as well. In Python, that script can be imported as a module in another script and used.
What Is if __name__ == "__main__" In Python?
Python files can act either as reusable modules or as standalone programs. The if __name__ == "__main__" condition lets some code execute only when the Python file is run directly, and not when it is imported.
How To Setup A Main Method In Python?
To set up the “main method” in Python, first define a function and then use the “if __name__ == ‘__main__’ ” condition to execute this function.
If the Python source file is imported as a module, the Python interpreter sets the __name__ value to the module name. The “if condition” then evaluates to false, and the main method is not executed.
How To Call Main Function In Python?
An important thing to note is that any function executes only when it is called. To call the main function, the implicit variable __name__ is checked, and main() is called when its value is "__main__".
How To Define Main In Python?
In Python, there are two ways to define and call the main method. Let’s see both these implementations.
1. Define In The Same File
The first implementation shows the way to define the main method in the same file. Let’s see the following steps and understand how to do this:
Python creates and sets the values of implicit variables when a program starts running. These variables do not need to be declared with a data type. __name__ is one such variable.
When the file is run directly, the value of this __name__ variable is set to "__main__".
Hence the main() method is defined first, and then an “if condition” is used to run the main() method.
print("How are you?")
def main():
    print("What about you?")
if __name__ == "__main__":
    main()
2. Imported From Another File
The second implementation shows how the main method behaves when the file that defines it is imported from another file.
To understand this, let’s first understand what modules are. A module is a file of Python code that can be imported into another file, so the same code can be used many times without being written again and again.
Now look at the following steps:
First, import the module in the program file to be run.
Inside the imported module, the __name__ variable now holds the name of the module rather than "__main__", so the if condition in the module is false and its main() is not called automatically.
Note that the module’s top-level code still runs at import time, before the remaining code in the file that imports it.
def main():
    print("What about you?")
if __name__ == "__main__":
    main()
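To see both cases side by side without creating files by hand, the following sketch writes a small module to a temporary directory, then imports it and also runs it as a script. The module name “greeter” and the helper variables are hypothetical; importlib and runpy are standard-library modules used here only to drive the demonstration.

```python
import importlib.util
import runpy
import tempfile
from pathlib import Path

# Source of a hypothetical module that records how it was executed
SOURCE = '''
seen_name = __name__              # the value the interpreter assigned

def main():
    return "main() ran"

result = main() if __name__ == "__main__" else None
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greeter.py"
    path.write_text(SOURCE)

    # Case 1: the file is imported as a module
    spec = importlib.util.spec_from_file_location("greeter", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    imported_name, imported_result = module.seen_name, module.result

    # Case 2: the file is run directly, as "python greeter.py" would do
    script_globals = runpy.run_path(str(path), run_name="__main__")
    script_name, script_result = script_globals["seen_name"], script_globals["result"]

print("imported:", imported_name, imported_result)
print("run directly:", script_name, script_result)
```

When imported, __name__ is the module name and main() is skipped; when run directly, __name__ is "__main__" and main() is called.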
Conclusion
Let’s conclude this article here. After reading it, you should be able to explain what the main() function in Python is, how it can be used, and how it helps control the flow of execution so that functionality runs as and when needed. We hope that you find this article relevant.
FAQs
When a Python file is run, the interpreter executes the code sequentially from the top. The main function is executed only when the file is run as a Python program; it is not run when the file is imported as a module.
In Python, the main function conventionally acts as the entry point of a program.
Python has no explicit main() function. Instead, it defines the execution point by other conventions: the Python interpreter runs each line serially from the top of the file.
Yes, the main method can be written in Python with the use of the “if __name__ == ‘__main__’ ” condition.
The if __name__ == "__main__" check is a conditional block used to allow or prevent parts of the code from being run when a module is imported.
Decorators are one of the most helpful and powerful tools in Python. A decorator modifies the behaviour of a function: it wraps the function in another function, which extends what the wrapped function does without permanently modifying it.
An example of a decorator is as follows:
def divide(x, y):
    print(x / y)
def outer_div(func):
    def inner(x, y):
        # The source text is cut off after "if(x"; swapping the arguments
        # when x < y is an assumed completion, not the original code.
        if x < y:
            x, y = y, x
        return func(x, y)
    return inner
A module in Python is a simple file that has a “.py” extension. It contains Python code that can be imported for use inside another Python program.
Top 145 Python Interview Questions for 2023- Great Learning – styxor.com
Published May 31, 2023




Table of contents
- Python Interview Questions for Freshers
- 1. What is Python?
- 2. Why Python?
- 3. How to Install Python?
- 4. What are the applications of Python?
- 5. What are the advantages of Python?
- 6. What are the key features of Python?
- 7. What do you mean by Python literals?
- 8. What type of language is Python?
- 9. How is Python an interpreted language?
- 10. What is pep 8?
- 11. What is namespace in Python?
- 12. What is PYTHONPATH?
- 13. What are Python modules?
- 14. What are local variables and global variables in Python?
- 15. Explain what Flask is and its benefits?
- 16. Is Django better than Flask?
- 17. Mention the differences between Django, Pyramid, and Flask.
- 18. Discuss Django architecture
- 19. Explain Scope in Python?
- 20. List the common built-in data types in Python?
- 21. What are global, protected, and private attributes in Python?
- 22. What are Keywords in Python?
- 23. What is the difference between lists and tuples in Python?
- 24. How can you concatenate two tuples?
- 25. What are functions in Python?
- 26. How can you initialize a 5*5 numpy array with only zeroes?
- 27. What are Pandas?
- 28. What are data frames?
- 29. What is a Pandas Series?
- 30. What do you understand about pandas groupby?
- 31. How to create a dataframe from lists?
- 32. How to create a data frame from a dictionary?
- 33. How to combine dataframes in pandas?
- 34. What kind of joins does pandas offer?
- 35. How to merge dataframes in pandas?
- 36. Give the below dataframe drop all rows having Nan.
- 37. How to access the first five entries of a dataframe?
- 38. How to access the last five entries of a dataframe?
- 39. How to fetch a data entry from a pandas dataframe using a given value in index?
- 40. What are comments and how can you add comments in Python?
- 41. What is a dictionary in Python? Give an example.
- 42. What is the difference between a tuple and a dictionary?
- 43. Find out the mean, median and standard deviation of this numpy array -> np.array([1,5,3,100,4,48])
- 44. What is a classifier?
- 45. In Python how do you convert a string into lowercase?
- 46. How do you get a list of all the keys in a dictionary?
- 47. How can you capitalize the first letter of a string?
- 48. How can you insert an element at a given index in Python?
- 49. How will you remove duplicate elements from a list?
- 50. What is recursion?
- 51. Explain Python List Comprehension.
- 52. What is the bytes() function?
- 53. What are the different types of operators in Python?
- 54. What is the ‘with statement’?
- 55. What is a map() function in Python?
- 56. What is __init__ in Python?
- 57. What are the tools present to perform static analysis?
- 58. What is pass in Python?
- 59. How can an object be copied in Python?
- 60. How can a number be converted to a string?
Are you an aspiring Python developer? Careers in Python have seen an upward trend in 2023, and you can be a part of the ever-growing community. So, if you are ready to dive into the pool of knowledge and prepare for an upcoming Python interview, then you are at the right place.
We have compiled a comprehensive list of Python Interview Questions and Answers that will come in handy at the time of need. Once you are prepared with the questions in our list, you will be ready for numerous Python job roles such as Python Developer, Data Scientist, Software Engineer, Database Administrator, Quality Assurance Tester, and more.
Python can achieve a great deal with few lines of code and supports heavy computation through its powerful libraries. Due to these factors, there is an increase in demand for professionals with Python programming knowledge.
This blog covers the most commonly asked Python Interview Questions that will help you land great job offers.
Python Interview Questions for Freshers
This section on Python Interview Questions for freshers covers 70+ questions that are commonly asked during the interview process. As a fresher, you may be new to the interview process; however, learning these questions will help you answer the interviewer confidently and ace your upcoming interview.
1. What is Python?
Python was created by Guido van Rossum and first released in 1991. It is a high-level, general-purpose programming language emphasizing code readability and providing easy-to-use syntax. Many developers and programmers prefer using Python for their programming needs due to its simplicity. Van Rossum stepped down as the leader of the community in 2018, nearly three decades after he began work on the language.
Python interpreters are available for many operating systems. CPython, the reference implementation of Python, is open-source software and has a community-based development model, as do nearly all of its variant implementations. The non-profit Python Software Foundation manages Python and CPython.
2. Why Python?
Python is a high-level, general-purpose programming language that may be used to create desktop GUI apps, websites, and online applications. As a high-level programming language, Python also allows you to concentrate on the application’s essential functionality while it handles routine programming duties. The simple syntax rules of the language make it considerably easier to keep the code base readable and the application manageable.
3. How to Install Python?
To install Python, one option is the Anaconda distribution, which bundles Python: go to the Anaconda website and click on “Download Anaconda”. Alternatively, the official installers are available from python.org. After Python is installed, the rest is pretty straightforward: the next step is to open an IDE and start coding in Python.
4. What are the applications of Python?
Python is notable for its general-purpose character, which allows it to be used in practically any software development sector. Python may be found in almost every new field. It is one of the most popular programming languages and may be used to create a wide variety of applications.
– Web Applications
We can use Python to develop web applications. It offers libraries for handling HTML and XML, JSON, email processing, and other internet protocols, along with third-party libraries such as Requests, Beautiful Soup, and Feedparser. Instagram uses Django, a Python web framework.
– Desktop GUI Applications
The Graphical User Interface (GUI) is a user interface that allows for easy interaction with any programme. Python contains the Tk GUI framework for creating user interfaces.
– Console-based Application
The command line or shell is used to execute console-based programmes. These are computer programmes that are used to carry out commands. This type of programme was more common in the previous generation of computers. Python is well known for its REPL, or Read-Eval-Print Loop, which makes it well suited to command-line work.
Python has a number of free libraries and modules that help in the creation of command-line applications. The appropriate IO libraries are used to read and write. The standard library has built-in capabilities for processing command-line parameters and generating console help text. There are additional advanced libraries that may be used to create standalone console applications.
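As a small illustration of the built-in support for processing parameters and generating help text, here is a sketch using the standard-library argparse module. The program name “greet” and its arguments are hypothetical.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="greet",  # hypothetical program name
        description="Print a greeting one or more times.",
    )
    parser.add_argument("name", help="who to greet")
    parser.add_argument("--count", type=int, default=1,
                        help="how many times to greet")
    return parser


def main(argv=None):
    # argv=None makes argparse read the real command line
    args = build_parser().parse_args(argv)
    lines = [f"Hello, {args.name}!"] * args.count
    print("\n".join(lines))
    return lines


if __name__ == "__main__":
    main(["Asha", "--count", "2"])
```

Running the program with -h or --help prints usage text that argparse generates from the help strings above.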
– Software Development
Python is useful in the software development process. It works as a support language for build control and management, testing, and other tasks.
- SCons is used for build control.
- Buildbot and Apache Gump automate continuous compilation and testing.
– Scientific and Numeric
This is the age of artificial intelligence, in which machines can perform some tasks as well as a person can. Python is an excellent programming language for artificial intelligence and machine learning applications. It has a number of scientific and mathematical libraries that make difficult computations simple.
Putting machine learning algorithms into practice requires a lot of mathematics. NumPy, Pandas, SciPy, scikit-learn, and other scientific and numerical Python libraries are available, and once you know Python you can import them at the top of your code and build on them.
– Business Applications
Business applications differ from standard apps. This type of program demands a lot of scalability and readability, which Python provides.
Odoo is a Python-based all-in-one suite that offers a wide range of business applications. Tryton, another platform for commercial applications, is also written in Python.
– Audio or Video-based Applications
Python is a versatile programming language that may be used to construct multimedia applications. TimPlayer, cplay, and other multimedia programmes written in Python are examples.
– 3D CAD Applications
Engineering designs are created using CAD (computer-aided design), which is used to create a three-dimensional visualization of a system component. The following tools can be used to develop a 3D CAD application in Python:
- Fandango (Popular)
- CAMVOX
- HeeksCNC
- AnyCAD
- RCAM
– Enterprise Applications
Python may be used to develop apps for use within a business or organization. OpenERP, Tryton, and Picalo are examples of such applications.
– Image Processing Application
Python has a lot of libraries for working with images, which can be altered to our specifications. OpenCV, Pillow, and SimpleITK are all image processing libraries available in Python. Together, these examples cover a wide range of applications in which Python plays a critical part.
5. What are the advantages of Python?
Python is a general-purpose dynamic programming language that is high-level and interpreted. Its design prioritizes code readability and relies on indentation to structure code.
- Third-party modules are present.
- Several support libraries are available (NumPy for numerical calculations, Pandas for data analytics, etc)
- Community development and open source
- Adaptable, simple to read, learn, and write
- Data structures that are pretty easy to work on
- High-level language
- Dynamically typed language (no need to declare a data type; the type is taken from the value assigned)
- Object-oriented programming language
- Interactive and portable
- Ideal for prototypes since it allows you to add additional features with minimal code.
- Highly Effective
- Internet of Things (IoT) Possibilities
- An interpreted language that is portable across operating systems
- Since it is an interpreted language, it executes code line by line and raises an error as soon as it encounters a problem.
- Python is free to use and has a large open-source community.
- Python has a lot of support for libraries that provide numerous functions for doing any task at hand.
- One of the best features of Python is its portability: the same code can run on many platforms without needing to be changed.
- Provides a lot of functionality in fewer lines of code compared to other programming languages like Java, C++, etc.
6. What are the key features of Python?
Python is one of the most popular programming languages used by data scientists and AI/ML professionals. This popularity is due to the following key features of Python:
- Python is easy to learn due to its clear syntax and readability
- Python is interpreted, which makes debugging easier
- Python is free and open-source
- It can be used across different platforms
- It is an object-oriented language that supports concepts of classes
- It can be easily integrated with other languages like C++, Java, and more
7. What do you mean by Python literals?
A literal is a simple and direct form of expressing a value. Literals reflect the primitive type options available in a language. Integers, floating-point numbers, Booleans, and character strings are some of the most common forms of literal.
Literals in Python are the raw data that is stored in a variable or constant. There are several types of literals in Python:
String Literals: A sequence of characters wrapped in quotes. Depending on the quotation marks used, a string can be single-quoted, double-quoted, or triple-quoted. Python has no separate character type; a single character enclosed in quotes is simply a string of length one.
Numeric Literals: These are immutable numbers of three types: integer, float, and complex.
Boolean Literals: True or False, which behave like 1 and 0, respectively.
Special Literals: Python has one special literal, None, which is used to represent the absence of a value.
- String literals: "halo", '12345'
- Int literals: 0, 1, 2, -1, -2
- Long literals: 89675L (Python 2 only; Python 3 has a single int type)
- Float literals: 3.14
- Complex literals: 12j
- Boolean literals: True or False
- Special literals: None
- Unicode literals: u"hello" (the u prefix is optional in Python 3, where all strings are Unicode)
- List literals: [], [5, 6, 7]
- Tuple literals: (), (9,), (8, 9, 0)
- Dict literals: {}, {'x': 1}
- Set literals: {8, 9, 10}
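A quick way to check what type each literal produces is to ask Python itself. The sketch below assumes Python 3, so the Python 2 long literal is left out; the dictionary labels are only for display.

```python
# One example literal per kind, taken from the list above (Python 3)
literals = {
    "string": "halo",
    "int": -2,
    "float": 3.14,
    "complex": 12j,
    "bool": True,
    "special": None,
    "unicode": u"hello",   # same type as a plain string in Python 3
    "list": [5, 6, 7],
    "tuple": (9,),
    "dict": {"x": 1},
    "set": {8, 9, 10},
}

# Record the type name that each literal evaluates to
type_names = {label: type(value).__name__ for label, value in literals.items()}

for label, value in literals.items():
    print(f"{label:8} {value!r:16} {type_names[label]}")
```

Note that an empty pair of braces, {}, creates an empty dict and not an empty set; an empty set is written as set().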
8. What type of language is Python?
Python is an interpreted, interactive, object-oriented programming language. Classes, modules, exceptions, dynamic typing, and extremely high-level dynamic data types are all present.
Python is an interpreted language with dynamic typing. Because the code is not compiled ahead of time into a native binary, such languages are sometimes referred to as “scripting” languages. Dynamically typed means that types don’t have to be declared when coding; the interpreter works them out at runtime.
Python’s concise, easy-to-learn syntax prioritizes readability, which lowers software maintenance costs. Python supports modules and packages, allowing for programme modularity and code reuse. The Python interpreter and its comprehensive standard library are free to download and distribute in source or binary form for all major platforms.
9. How is Python an interpreted language?
An interpreter takes your code and executes (does) the actions you provide, produces the variables you specify, and performs a lot of behind-the-scenes work to ensure it works smoothly or warns you about issues.
Strictly speaking, Python itself is neither interpreted nor compiled; whether code is interpreted or compiled is a property of the implementation, not of the language. Python source code is translated into bytecode (a collection of interpreter-readable instructions), which may then be executed in a variety of ways.
The source code is saved in a .py file.
Python generates a set of instructions for a virtual machine from the source code. This intermediate format is known as “bytecode”, and it is created by compiling .py source code; the cached result is stored in a .pyc file. This bytecode can then be interpreted by the standard CPython interpreter, while PyPy uses a JIT (Just-in-Time) compiler.
Python is known as an interpreted language because it uses an interpreter to run the code you write, rather than compiling it ahead of time into instructions for your computer’s processor. You will later download and use the Python interpreter to write Python code and execute it on your own computer when working on a project.
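The compile-to-bytecode step can be observed from within Python. The sketch below assumes CPython; the exact instructions printed by the standard-library dis module vary between Python versions, so no particular output is shown.

```python
import dis


def add(a, b):
    return a + b


# Every function carries a code object whose co_code holds the raw bytecode
raw_bytecode = add.__code__.co_code
print(type(raw_bytecode), len(raw_bytecode), "bytes")

# dis prints a human-readable listing of the bytecode instructions
dis.dis(add)

# compile() performs the source-to-bytecode step explicitly;
# exec() then asks the interpreter to run the resulting code object
code_object = compile("x = 2 + 3", "<example>", "exec")
namespace = {}
exec(code_object, namespace)
print(namespace["x"])
```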
10. What is pep 8?
PEP 8, often known as PEP8 or PEP-8, is a document that outlines best practices and recommendations for writing Python code. It was written in 2001 by Guido van Rossum, Barry Warsaw, and Nick Coghlan. The main goal of PEP 8 is to make Python code more readable and consistent.
PEP is an acronym for Python Enhancement Proposal, and there are many of them. A PEP is a document that explains new features suggested for Python and documents aspects of Python, such as design and style, for the community.
11. What is namespace in Python?
In Python, a namespace is a system that ensures each name maps to a single object within its context. A variable or a method might be considered an object. Python maintains its own namespaces, which are kept in the form of Python dictionaries. Let’s look at a directory-file system structure in a computer as an example. A file with the same name might be found in numerous folders. However, by supplying the absolute path of the file, one can be directed to the right file.
A namespace is essentially a technique for ensuring that all of the names in a programme are distinct and can be used without conflict. You may already be aware that everything in Python is an object, including strings, lists, functions, and so on. Another notable thing is that Python uses dictionaries to implement namespaces. A name-to-object mapping exists, with the names serving as keys and the objects serving as values. The same name can be used by many namespaces, each mapping it to a distinct object. Here are a few namespace examples:
Local Namespace: This namespace stores the local names inside a function. This namespace is created when a function is invoked and only lives till the function returns.
Global Namespace: This namespace stores the names defined at the top level of a module, including the names of modules imported into it. It’s formed when the module is loaded and lasts till the script is completed.
Built-in Namespace: This namespace contains the names of built-in functions and exceptions.
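The three namespaces can be inspected directly, since each one is exposed as a dictionary. This is a minimal sketch; the names counter and show_local are made up for the demonstration.

```python
import builtins

counter = 10  # lives in the module's global namespace


def show_local():
    counter = 99  # a separate name in the function's local namespace
    return dict(locals())  # snapshot of the local namespace


local_names = show_local()

print(local_names)                  # the local mapping, name -> object
print(globals()["counter"])         # the global name is untouched
print("len" in vars(builtins))      # built-in names live in their own namespace
```

The same name, counter, maps to two different objects because the local and global namespaces are separate dictionaries.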
12. What is PYTHONPATH?
PYTHONPATH is an environment variable that allows the user to add additional folders to sys.path, the list of directories Python searches for modules. In a nutshell, it is an environment variable that is set before the start of the Python interpreter.
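Because PYTHONPATH is read when the interpreter starts, the sketch below launches a second Python process with the variable set and checks that the extra folder shows up in that process's sys.path. The temporary directory stands in for whatever folder you would actually want to add.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as extra_dir:
    # Copy the current environment and add PYTHONPATH for the child process only
    env = dict(os.environ, PYTHONPATH=extra_dir)

    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    )
    child_sys_path = completed.stdout.splitlines()

    # Compare resolved paths, since the OS may report the folder differently
    found = any(
        os.path.realpath(entry) == os.path.realpath(extra_dir)
        for entry in child_sys_path if entry
    )

print("extra folder on the child's sys.path:", found)
```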
13. What are Python modules?
A Python module is a collection of Python commands and definitions in a single file. In a module, you may specify functions, classes, and variables. A module can also include executable code. When code is organized into modules, it is easier to understand and use. It also logically organizes the code.
14. What are local variables and global variables in Python?
Local variables are declared inside a function and have a scope that is confined to that function alone, whereas global variables are defined outside of any function and have a global scope. To put it another way, local variables are only available within the function in which they were created, but global variables are accessible across the programme and throughout each function.
Local Variables
Local variables are variables that are created within a function and are exclusive to that function. Outside of the function, it can’t be accessed.
Global Variables
Global variables are variables that are defined outside of any function and are available throughout the programme, that is, both inside and outside of each function.
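The difference can be seen in a few lines. In this sketch the variable and function names are made up; the global statement is needed only when a function assigns to a global variable, not when it merely reads one.

```python
greeting = "hello"   # global variable: visible throughout the module
calls = 0            # another global, modified by count_calls() below


def shout():
    punctuation = "!"                       # local variable: exists only inside shout()
    return greeting.upper() + punctuation   # reading a global needs no declaration


def count_calls():
    global calls     # required because the function assigns to the global
    calls += 1
    return calls


print(shout())
count_calls()
count_calls()
print(calls)

# The local name is not accessible outside the function
try:
    punctuation
    local_visible_outside = True
except NameError:
    local_visible_outside = False
print(local_visible_outside)
```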
15. Explain what Flask is and its benefits?
Flask is an open-source web framework. Flask is a set of tools, libraries, and technologies for building online applications. A web application built with it can be a web page, a wiki, a large web-based calendar application, or a commercial website. Flask is a micro-framework, which means it has few dependencies on external libraries.
Benefits:
There are several compelling reasons to use Flask as a web application framework, such as:
- Integrated support for unit testing
- There’s a built-in development server as well as a fast debugger.
- RESTful request dispatching, Unicode based
- Support for cookies
- Jinja2 templating; WSGI 1.0 compliant
- Additionally, Flask gives you complete control over the progress of your project.
- HTTP request processing function
- Flask is a lightweight and versatile web framework that can be easily extended with a few extensions.
- You can plug in your favorite ORM. The core API is well designed and coherent.
- Extremely adaptable
- Flask is easy to deploy in production.
16. Is Django better than Flask?
Django is more popular because it has plenty of functionality out of the box, making complicated applications easier to build. Django is best suited for larger projects with a lot of features. The features may be overkill for lesser applications.
If you’re new to web programming, Flask is a fantastic place to start. Many websites are built with Flask and receive a lot of traffic, although not as much as Django-based websites. If you want precise control, you should use flask, whereas a Django developer relies on a large community to produce unique websites.
17. Mention the differences between Django, Pyramid, and Flask.
Flask is a "micro framework" designed for smaller applications with simpler requirements. Pyramid and Django are both geared toward larger projects, but they approach extension and flexibility in different ways.
Pyramid is designed to be flexible, allowing the developer to use the best tools for their project. This means that the developer may choose the database, URL structure, templating style, and other options. Django aspires to include all of the batteries that a web application would require, so programmers simply need to open the box and start working, bringing in Django's many components as they go.
Django includes an ORM by default, but Pyramid and Flask provide the developer control over how (and whether) their data is stored. SQLAlchemy is the most popular ORM for non-Django web apps, but there are lots of alternative options, ranging from DynamoDB and MongoDB to simple local persistence like LevelDB or regular SQLite. Pyramid is designed to work with any sort of persistence layer, even those that have yet to be conceived.
Django | Pyramid | Flask |
It is a full-stack Python framework. | It is also a Python framework. | It is a micro-framework. |
It is used to build large applications. | It is also aimed at large applications. | It is used to build small applications. |
It includes an ORM. | It provides flexibility and leaves the choice of tools to the developer. | It has a small core and leaves the choice of ORM to the developer. |
18. Discuss Django architecture
Django follows an MVC (Model-View-Controller) style architecture, which is divided into three parts. Django's own documentation describes its variant as MTV (Model-Template-View), where the template plays the role of the MVC view and the Django view plays the role of the controller.
1. Model
The Model is the logical data structure that underpins the whole application and is backed by a database (generally a relational database such as MySQL or PostgreSQL).
2. View
The View is the user interface, or what you see when you visit a website in your browser. It is represented by HTML, CSS, and JavaScript files.
3. Controller
The Controller is the link between the view and the model, and it is responsible for transferring data from the model to the view.
With MVC, your application revolves around the model, either displaying it or altering it.
19. Explain Scope in Python?
Think of scope as the region of code in which a name is visible; every object is used within some scope. More formally, a scope is a block of code within which the names you declare remain valid. A few examples are given below:
- Local Scope: A variable created inside a function belongs to the local scope of that function and can only be used inside that function.
Example:
def harshit_func():
    y = 100
    print(y)
harshit_func()
o/p: 100
- Global Scope: A variable created in the main body of the Python code belongs to the global scope. Global variables are accessible from any part of the code and from any scope, be it global or local.
Example:
y = 100
def harshit_func():
    print(y)
harshit_func()
print(y)
- Nested Function: A nested function is a function defined inside another function. As in the local scope example above, the variable y is not available outside the function, but it is available to any function nested inside it.
Example:
def first_func():
    y = 100
    def nested_func1():
        print(y)
    nested_func1()
first_func()
- Module Level Scope: This essentially refers to the global objects of the current module accessible within the program.
- Outermost Scope: This is the built-in scope, which holds all the built-in names that you can call in the program.
20. List the common built-in data types in Python?
Given below are the most commonly used built-in data types:
Numbers: Consists of integers, floating-point numbers, and complex numbers.
List: A list is an ordered, mutable sequence of items; the elements of a list can belong to different data types.
Example:
my_list = [100, "Great Learning", 30]
Tuples: A tuple is also an ordered sequence of elements, but unlike lists, tuples are immutable, meaning they cannot be changed once declared.
Example:
tup_2 = (100, "Great Learning", 20)
String: A string is a sequence of characters enclosed in single or double quotes.
Example:
"Hi, I work at great learning"
'Hi, I work at great learning'
Sets: Sets are unordered collections of unique items.
Example:
my_set = {1, 2, 3}
Dictionary: A dictionary always stores values in key and value pairs where each value can be accessed by its particular key.
Example:
harshit = {1: 'video_games', 2: 'sports', 3: 'content'}
Boolean: There are only two boolean values: True and False
21. What are global, protected, and private attributes in Python?
The attributes of a class are also called variables. Python does not enforce access modifiers; instead, three levels of access are expressed through naming conventions:
a. public – Names without a leading underscore are accessible everywhere, inside or outside the class.
b. private – Names with two leading underscores (for example __name) are name-mangled by the interpreter and are intended to be used only within the current class.
c. protected – Names with a single leading underscore (for example _name) are, by convention, meant to be used only within the class and its subclasses.
Attributes are also classified as:
– Local attributes are defined within a code-block/method and can be accessed only within that code-block/method.
– Global attributes are defined outside the code-block/method and are accessible everywhere.
class Mobile:
    m1 = "Samsung Mobiles"  # Global attribute
    def price(self):
        m2 = "Costly mobiles"  # Local attribute
        return m2
Sam_m = Mobile()
print(Sam_m.m1)
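The following sketch shows how the underscore naming conventions behave in practice. The Account class and its attribute names are hypothetical; the point is that a single underscore is only a hint to other programmers, while a double underscore triggers name mangling rather than true privacy.

```python
class Account:
    def __init__(self, owner, balance):
        self.owner = owner          # public by convention
        self._branch = "main"       # "protected": a hint, not enforced
        self.__balance = balance    # "private": stored as _Account__balance

    def balance(self):
        return self.__balance


acct = Account("asha", 250)
print(acct.owner)                  # asha
print(acct._branch)                # main (accessible, but discouraged)
print(acct.balance())              # 250
print(hasattr(acct, "__balance"))  # False
print(acct._Account__balance)      # 250 (the mangled name)
```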
22. What are Keywords in Python?
Keywords in Python are reserved words that cannot be used as identifiers, function names, or variable names. They help define the structure and syntax of the language.
Python 3.7 has 35 keywords (async and await became reserved words in that release, joining the 33 of earlier Python 3 versions), and the list can change between versions. A list of all the keywords is provided below:
Keywords in Python:
False | class | finally | is | return |
None | continue | for | lambda | try |
True | def | from | nonlocal | while |
and | del | global | not | with |
as | elif | if | or | yield |
assert | else | import | pass | async |
break | except | in | raise | await |
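Because the keyword list depends on the interpreter version, it can be safer to ask the interpreter itself. A minimal sketch using the standard keyword module:

```python
import keyword

# kwlist holds the reserved words of the running interpreter.
print(len(keyword.kwlist))
print(keyword.kwlist)

# iskeyword() checks a single name.
print(keyword.iskeyword("lambda"))   # True
print(keyword.iskeyword("print"))    # False: print is a built-in function
```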
23. What is the difference between lists and tuples in Python?
Lists and tuples are data structures in Python that can store one or more objects or values. A list is created with square brackets and holds multiple objects in one variable. Tuples can also hold multiple items in a single variable and are defined with parentheses.
Lists | Tuples |
Lists are mutable. | Tuples are immutable. |
Iterating over a list can be slightly slower. | Iterating over a tuple can be slightly faster. |
Lists are more convenient for operations such as insertion and deletion. | Tuples are better suited to read-only access to their elements. |
Lists take up more memory. | A tuple uses less memory than a list of the same items. |
Lists have many built-in methods. | Tuples have few built-in methods. |
Unexpected changes and errors are more likely to occur. | Unexpected changes are unlikely because a tuple cannot be modified. |
Syntax: my_list = [100, "Great Learning", 30] | Syntax: tup_2 = (100, "Great Learning", 20) |
24. How can you concatenate two tuples?
Let's say we have two tuples like this:
tup1 = (1, "a", True)
tup2 = (4, 5, 6)
Concatenation of tuples means that we are adding the elements of one tuple at the end of another tuple.
Now, let's go ahead and concatenate tup2 onto the end of tup1:
Code:
tup1=(1,"a",True)
tup2=(4,5,6)
tup1+tup2
All you have to do is, use the ‘+’ operator between the two tuples and you’ll get the concatenated result.
Similarly, let's concatenate tup1 onto the end of tup2:
Code:
tup1=(1,"a",True)
tup2=(4,5,6)
tup2+tup1
25. What are functions in Python?
Functions in Python are organized, reusable blocks of code that perform a single task or a set of related tasks. Functions improve the modularity of an application and allow a high degree of code reuse. Python has a number of built-in functions like print(). However, it also allows you to create user-defined functions.
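A user-defined function can be sketched as follows. The function name area_of_rectangle and its default argument are chosen purely for illustration.

```python
def area_of_rectangle(width, height=1):
    """Return the area of a rectangle; height defaults to 1."""
    return width * height


print(area_of_rectangle(3, 4))   # 12
print(area_of_rectangle(5))      # 5
```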
26. How can you initialize a 5*5 numpy array with only zeroes?
We will be using the .zeros() method.
import numpy as np
n1=np.zeros((5,5))
n1
Use np.zeros() and pass in the dimensions inside it. Since we want a 5*5 matrix, we will pass (5,5) inside the .zeros() method.
27. What is pandas?
Pandas is an open-source Python library that has a very rich set of data structures for data-based operations. With its wide range of features, pandas fits every kind of data work, whether it be academic research or solving complex business problems. Pandas can deal with a large variety of file formats and is one of the most important tools to have a grip on.
28. What are data frames?
A pandas dataframe is a mutable, two-dimensional data structure in pandas. It supports heterogeneous data arranged along two axes (rows and columns).
Reading files into pandas:
import pandas as pd
df = pd.read_csv("mydata.csv")
Here, df is a pandas data frame. read_csv() is used to read a comma-delimited file as a dataframe in pandas.
29. What is a Pandas Series?
A Series is a one-dimensional pandas data structure that can hold data of almost any type. It resembles an Excel column. It supports multiple operations and is used for single-dimensional data operations.
Creating a series from data:
Code:
import pandas as pd
data=["1",2,"three",4.0]
series=pd.Series(data)
print(series)
print(type(series))
30. What do you understand about pandas groupby?
groupby is a pandas feature that is used to split an object into groups. Like GROUP BY in SQL databases such as MySQL and Oracle, it groups data by classes or entities, which can then be used for aggregation. A dataframe can be grouped by one or more columns.
Code:
df = pd.DataFrame({'Vehicle':['Etios','Lamborghini','Apache200','Pulsar200'], 'Type':["car","car","motorcycle","motorcycle"]})
df
To perform groupby type the following code:
df.groupby('Type').count()
31. How to create a dataframe from lists?
To create a dataframe from lists,
1) create an empty dataframe
2) add the lists as individual columns to the dataframe
Code:
df=pd.DataFrame()
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
df["cars"]=cars
df["bikes"]=bikes
df
32. How to create a data frame from a dictionary?
A dictionary can be directly passed as an argument to the DataFrame() function to create the data frame.
Code:
import pandas as pd
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
d={"cars":cars,"bikes":bikes}
df=pd.DataFrame(d)
df
33. How to combine dataframes in pandas?
Two different dataframes can be stacked either horizontally or vertically by the concat(), append(), and join() functions in pandas. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so current code should use pd.concat() instead.
concat() works best when the dataframes have the same columns. By default it stacks the dataframes vertically into a single dataframe; with axis=1 it places them side by side.
append() stacked the rows of one dataframe below another, that is, vertically. In current pandas versions the same result is obtained with pd.concat().
join() is used when we need to combine data from different dataframes that share an index or one or more common columns. The stacking is horizontal in this case.
34. What kind of joins does pandas offer?
Pandas offers left, inner, right, and outer joins.
35. How to merge dataframes in pandas?
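Merging depends on the type and fields of the dataframes being merged. Use pd.merge() (or DataFrame.merge()) for database-style joins on one or more key columns or on the index; the how parameter selects the join type. To simply stack dataframes, use pd.concat(): along axis 0 when they share the same columns, or along axis 1 to place them side by side.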
36. Given the dataframe below, drop all rows containing NaN.
The dropna function can be used to do that.
df.dropna(inplace=True)
df
37. How to access the first five entries of a dataframe?
By using the head(5) function we can get the top five entries of a dataframe. By default df.head() returns the top 5 rows. To get the top n rows df.head(n) will be used.
38. How to access the last five entries of a dataframe?
By using the tail(5) function we can get the last five entries of a dataframe. By default df.tail() returns the last 5 rows. To get the last n rows df.tail(n) will be used.
39. How to fetch a data entry from a pandas dataframe using a given value in index?
To fetch a row from a dataframe given index x, we can use loc.
df.loc[10] where 10 is the value of the index.
Code:
import pandas as pd
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
d={"cars":cars,"bikes":bikes}
df=pd.DataFrame(d)
a=[10,20,30,40,50]
df.index=a
df.loc[10]
40. What are comments and how can you add comments in Python?
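Comments in Python are pieces of text intended for human readers and ignored by the interpreter. They are especially relevant when more than one person works on a codebase. They can be used to explain code, leave feedback, and help with debugging. There are two common styles:
- Single-line comment
- Multiple-line comment
Python has no dedicated multi-line comment syntax; a multi-line comment is written either as several lines starting with # or as a triple-quoted string that is not assigned to anything.
Syntax for adding a comment:
# Note: single-line comment
"""Note
Note
Note"""  # multi-line comment (a triple-quoted string)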
41. What is a dictionary in Python? Give an example.
A Python dictionary is a collection of key-value pairs. Since Python 3.7, dictionaries preserve insertion order. Python dictionaries are written in curly brackets with keys and values. Dictionaries are optimised to retrieve values for known keys.
Example
d = {"a": 1, "b": 2}
42. What is the difference between a tuple and a dictionary?
One major difference between a tuple and a dictionary is that a dictionary is mutable while a tuple is not. Meaning the content of a dictionary can be changed without changing its identity, but in a tuple, that’s not possible.
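A small sketch of that difference, using made-up variable names: the dictionary accepts changes in place, while the same kind of assignment on a tuple raises a TypeError.

```python
point = (1, 2)
settings = {"theme": "dark"}

settings["theme"] = "light"   # dictionaries can be changed in place
settings["font"] = "mono"     # and can grow

try:
    point[0] = 10             # tuples do not support item assignment
except TypeError as exc:
    error = str(exc)

print(settings)   # {'theme': 'light', 'font': 'mono'}
print(point)      # (1, 2)
```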
43. Find out the mean, median and standard deviation of this numpy array -> np.array([1,5,3,100,4,48])
import numpy as np
n1=np.array([1,5,3,100,4,48])
print(np.mean(n1))
print(np.median(n1))
print(np.std(n1))
44. What is a classifier?
A classifier is used to predict the class of any data point. Classifiers are special hypotheses that are used to assign class labels to any particular data point. A classifier often uses training data to understand the relation between input variables and the class. Classification is a method used in supervised learning in Machine Learning.
45. In Python how do you convert a string into lowercase?
All the uppercase characters in a string can be converted to lowercase by using the method: string.lower()
ex:
string = 'GREATLEARNING'
print(string.lower())
o/p: greatlearning
46. How do you get a list of all the keys in a dictionary?
One of the ways we can get a list of keys is by using: dict.keys()
This method returns a view of all the keys in the dictionary; wrap it in list() to get a list.
d = {1: 'a', 2: 'b', 3: 'c'}
list(d.keys())
o/p: [1, 2, 3]
47. How can you capitalize the first letter of a string?
We can use the capitalize() method to capitalize the first character of a string. It also converts the remaining characters to lowercase.
Syntax:
string.capitalize()
ex:
n = "greatlearning"
print(n.capitalize())
o/p: Greatlearning
48. How can you insert an element at a given index in Python?
Python lists have a built-in method called insert().
It can be used to insert an element at a given index.
Syntax:
list_name.insert(index, element)
ex:
numbers = [0, 1, 2, 3, 4, 5, 6, 7]
# insert 10 at index 6
numbers.insert(6, 10)
print(numbers)
o/p: [0, 1, 2, 3, 4, 5, 10, 6, 7]
49. How will you remove duplicate elements from a list?
There are various methods to remove duplicate elements from a list. The most common one is converting the list into a set by using the set() function and then using the list() function to convert it back to a list if required. Note that a set does not preserve the original order of the elements.
ex:
list0 = [2, 6, 4, 7, 4, 6, 7, 2]
list1 = list(set(list0))
print("The list without duplicates : " + str(list1))
o/p: The list without duplicates : [2, 4, 6, 7] (the order of the elements may vary)
50. What is recursion?
Recursion is a function calling itself one or more times in its body. One very important condition for a recursive function is that it must terminate through a base case; otherwise it would keep calling itself indefinitely, and Python would eventually raise a RecursionError.
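A classic illustration is the factorial function, sketched here with an explicit base case that stops the recursion.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n <= 1:                       # base case: stops the recursion
        return 1
    return n * factorial(n - 1)      # recursive case


print(factorial(5))   # 120
```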
51. Explain Python List Comprehension.
List comprehensions are used for transforming one list into another list. Elements can be conditionally included in the new list and each element can be transformed as needed. A list comprehension consists of an expression followed by a for clause, enclosed in brackets.
For ex:
numbers = [i for i in range(1000)]
print(numbers)
52. What is the bytes() function?
The bytes() function returns a bytes object. It is used to convert objects into bytes objects or create empty bytes objects of the specified size.
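The three common ways of calling bytes() can be sketched as follows; the variable names are illustrative only.

```python
empty = bytes(4)                    # four zero bytes
from_text = bytes("hi", "utf-8")    # encode a string with a given encoding
from_ints = bytes([72, 105])        # each integer must be in range(256)

print(empty)       # b'\x00\x00\x00\x00'
print(from_text)   # b'hi'
print(from_ints)   # b'Hi'
```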
53. What are the different types of operators in Python?
Python has the following basic operators:
Arithmetic (Addition(+), Subtraction(-), Multiplication(*), Division(/), Modulus(%)), Relational (<, >, <=, >=, ==, !=),
Assignment (=, +=, -=, /=, *=, %=),
Logical (and, or, not), Membership, Identity, and Bitwise Operators
54. What is the ‘with statement’?
The "with" statement in Python is used for resource management and simplifies the related exception handling. A file opened in a "with" statement is closed automatically when the block finishes, even if an exception is raised, without calling the close() function explicitly. It essentially makes the code much easier to read.
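A minimal sketch of the pattern, writing to a throwaway file created with the tempfile module (the file name notes.txt is arbitrary):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")  # throwaway file

with open(path, "w") as handle:
    handle.write("hello")

# Leaving the block closes the file, even if an exception was raised inside.
print(handle.closed)   # True

with open(path) as handle:
    content = handle.read()

print(content)         # hello
```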
55. What is a map() function in Python?
The map() function in Python applies a function to every element of an iterable. It takes two parameters, function and iterable. The function is passed as the first argument and is applied to each element of the iterable (passed as the second argument). In Python 3, map() returns an iterator, which can be converted to a list with list().
def add(n):
    return n + n
numbers = (15, 25, 35, 45)
res = map(add, numbers)
print(list(res))
o/p: [30, 50, 70, 90]
56. What is __init__ in Python?
__init__ is a reserved method in Python, known as the constructor in OOP. When an object is created from a class, the __init__ method is called automatically to initialize the object's attributes.
57. What are the tools present to perform static analysis?
Two static analysis tools used to find bugs in Python are PyChecker and Pylint. PyChecker detects bugs in the source code and warns about its style and complexity, while Pylint checks whether a module conforms to a coding standard.
58. What is pass in Python?
Pass is a statement that does nothing when executed. In other words, it is a Null statement. This statement is not ignored by the interpreter, but the statement results in no operation. It is used when you do not want any command to execute but a statement is required.
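Two typical uses are sketched below, with made-up names: a function whose body has not been written yet, and a branch that deliberately does nothing.

```python
def not_implemented_yet():
    pass  # placeholder body; calling the function returns None


for number in range(3):
    if number == 1:
        pass  # nothing to do for 1, but the if-block needs a statement
    else:
        print(number)

print(not_implemented_yet())   # None
```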
59. How can an object be copied in Python?
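Not all objects can be copied in Python, but most can. The "=" operator does not copy an object; it only binds another name to the same object. To create a copy, use the copy module: copy.copy() for a shallow copy or copy.deepcopy() for a deep copy.
ex:
import copy
var = copy.copy(obj)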
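The difference between plain assignment, a shallow copy, and a deep copy shows up with nested objects. A short sketch with illustrative names:

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                  # same object, no copy made
shallow = copy.copy(original)     # new outer list, shared inner lists
deep = copy.deepcopy(original)    # fully independent copy

original[0].append(99)

print(alias is original)   # True
print(shallow[0])          # [1, 2, 99]
print(deep[0])             # [1, 2]
```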
60. How can a number be converted to a string?
The inbuilt function str() can be used to convert a number to a string.
61. What are modules and packages in Python?
Modules are the way to structure a program. Each Python program file is a module, which can import attributes and objects from other modules. A folder of modules is a package. A package can contain modules or subpackages (subfolders).
62. What is the object() function in Python?
In Python, the object() function returns an empty object. New properties or methods cannot be added to this object.
63. What is the difference between NumPy and SciPy?
NumPy stands for Numerical Python while SciPy stands for Scientific Python. NumPy is the basic library for defining arrays and solving simple mathematical problems, while SciPy builds on NumPy and is used for more complex problems such as numerical integration and optimization.
64. What does len() do?
len() is used to determine the length of a string, a list, an array, and so on.
ex:
text = "greatlearning"
print(len(text))
o/p: 13
65. Define encapsulation in Python?
Encapsulation means binding the code and the data together. A Python class for example.
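As a rough sketch, the hypothetical Temperature class below bundles a value with the code that is allowed to change it, using a property to validate updates.

```python
class Temperature:
    """Bundles a value with the methods that are allowed to change it."""

    def __init__(self, celsius):
        self._celsius = celsius   # leading underscore: internal by convention

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value


reading = Temperature(21.5)
reading.celsius = 25.0
print(reading.celsius)   # 25.0
```

Callers read and write reading.celsius like a plain attribute, while the class keeps control over which values are accepted.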
66. What is type() in Python?
type() is a built-in function that either returns the type of an object or creates a new type object, depending on the arguments passed.
ex:
a = 100
type(a)
o/p: <class 'int'>
67. What is the split() function used for?
The split() method is used to split a string into shorter strings using a defined separator.
text = "A,B,C"
n = text.split(",")
print(n)
o/p: ['A', 'B', 'C']
68. What built-in types does Python provide?
Python has the following built-in data types:
Numbers: Python identifies three types of numbers:
- Integer: All positive and negative numbers without a fractional part
- Float: Any real number with floating-point representation
- Complex numbers: A number with a real and an imaginary component, represented as x+yj, where x and y are floats and j is the imaginary unit (the square root of -1)
Boolean: The Boolean data type is a data type that has one of two possible values i.e. True or False. Note that ‘T’ and ‘F’ are capital letters.
String: A string value is a collection of one or more characters put in single, double or triple quotes.
List: A list object is an ordered collection of one or more data items that can be of different types, put in square brackets. A list is mutable and thus can be modified, we can add, edit or delete individual elements in a list.
Set: An unordered collection of unique objects enclosed in curly brackets
Frozen set: They are like a set but immutable, which means we cannot modify their values once they are created.
Dictionary: A dictionary object is a collection of key-value pairs in which each value is associated with a key, and we can access each value through its key. Since Python 3.7, a dictionary preserves insertion order. The pairs are enclosed in curly brackets. For example {'First Name': 'Tom', 'last name': 'Hardy'}. Note that number values, strings, and tuples are immutable, while list and dictionary objects are mutable.
69. What is docstring in Python?
Python docstrings are the string literals enclosed in triple quotes that appear right after the definition of a function, method, class, or module. These are generally used to describe the functionality of a particular function, method, class, or module. We can access these docstrings using the __doc__ attribute.
Here is an example:
def square(n):
    '''Takes in a number n, returns the square of n'''
    return n**2
print(square.__doc__)
Output: Takes in a number n, returns the square of n
70. How to Reverse a String in Python?
In Python, strings have no built-in method for reversing them. Instead, we can use a slicing operation with a step of -1.
str_reverse = string[::-1]
71. How to check the Python Version in CMD?
To check the Python version from the command line, first open a terminal. On Windows, open Command Prompt (CMD); on macOS, press Cmd + Space to open Spotlight, type "terminal", and press Enter. Then type python --version or python -V and press Enter. The Python version is printed on the line below the command.
72. Is Python case sensitive when dealing with identifiers?
Yes. Python is case-sensitive when dealing with identifiers. It is a case-sensitive language. Thus, variable and Variable would not be the same.
Python Interview Questions for Experienced
This section on Python Interview Questions for Experienced covers 20+ questions that are commonly asked during the interview process for landing a job as a Python experienced professional. These commonly asked questions can help you brush up your skills and know what to expect in your upcoming interviews.
73. How to create a new column in pandas by using values from other columns?
We can perform column based mathematical operations on a pandas dataframe. Pandas columns containing numeric values can be operated upon by operators.
Code:
import pandas as pd
a=[1,2,3]
b=[2,3,5]
d={"col1":a,"col2":b}
df=pd.DataFrame(d)
df["Sum"]=df["col1"]+df["col2"]
df["Difference"]=df["col1"]-df["col2"]
df
74. What are the different functions that can be used with groupby in pandas?
groupby() in pandas can be used with multiple aggregate functions, such as sum(), mean(), count(), and std().
Data is divided into groups based on categories and then the data in these individual groups can be aggregated by the aforementioned functions.
75. How to delete a column or group of columns in pandas? Given the below dataframe drop column “col1”.
drop() function can be used to delete the columns from a dataframe.
d={"col1":[1,2,3],"col2":["A","B","C"]}
df=pd.DataFrame(d)
df=df.drop(["col1"],axis=1)
df
76. Given the following dataframe, drop the rows whose column value is A.
Code:
d={"col1":[1,2,3],"col2":["A","B","C"]}
df=pd.DataFrame(d)
df=df[df.col2!="A"]
df
77. What is Reindexing in pandas?
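Reindexing is the process of re-assigning the index of a pandas dataframe. The example below assigns new row labels directly through df.index; pandas also provides df.reindex(), which conforms the existing data to a new set of labels.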
Code:
import pandas as pd
bikes=["bajaj","tvs","herohonda","kawasaki","bmw"]
cars=["lamborghini","masserati","ferrari","hyundai","ford"]
d={"cars":cars,"bikes":bikes}
df=pd.DataFrame(d)
a=[10,20,30,40,50]
df.index=a
df
78. What do you understand about the lambda function? Create a lambda function which will print the sum of all the elements in this list -> [5, 8, 10, 20, 50, 100]
Lambda functions are anonymous functions in Python. They are defined using the keyword lambda. Lambda functions can take any number of arguments, but they can only have one expression.
from functools import reduce
sequences = [5, 8, 10, 20, 50, 100]
total = reduce(lambda x, y: x + y, sequences)
print(total)
79. What is vstack() in numpy? Give an example.
vstack() is a NumPy function that stacks arrays vertically, row by row. All rows must have the same number of elements.
Code:
import numpy as np
n1=np.array([10,20,30,40,50])
n2=np.array([50,60,70,80,90])
print(np.vstack((n1,n2)))
80. How to remove spaces from a string in Python?
Spaces can be removed from a string in Python by using the strip() or replace() methods. strip() removes leading and trailing whitespace, while replace(" ", "") removes all the spaces in the string.
ex1:
str1 = "  great learning  "
print(str1.strip())
o/p: great learning
ex2:
str2 = "great learning"
print(str2.replace(" ", ""))
o/p: greatlearning
81. Explain the file processing modes that Python supports.
There are four basic file processing modes in Python: read-only ("r"), write-only ("w"), read-write ("r+"), and append ("a"). By default, files are opened in text mode, so the preceding modes can also be written as "rt" for read-only, "wt" for write, and so on. Similarly, a binary file can be opened by adding "b" to the file access flag (for example "rb", "wb", "r+b", and "ab").
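The modes can be tried out on a throwaway file. This sketch uses the tempfile module so that nothing in the working directory is touched; the file name modes.txt is arbitrary.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")  # throwaway file

with open(path, "w") as f:      # write: creates or truncates the file
    f.write("first\n")

with open(path, "a") as f:      # append: writes at the end
    f.write("second\n")

with open(path, "r") as f:      # read (text mode is the default)
    text = f.read()

with open(path, "rb") as f:     # binary read returns bytes
    raw = f.read()

print(text.splitlines())    # ['first', 'second']
print(type(raw).__name__)   # bytes
```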
82. What is pickling and unpickling?
Pickling is the process of converting a Python object hierarchy into a byte stream so that it can be stored in a file or database or sent over a network. It is also known as serialization. Unpickling is the reverse of pickling: the byte stream is converted back into an object hierarchy.
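A minimal round trip with the standard pickle module might look like this; the record dictionary is made-up sample data.

```python
import pickle

record = {"name": "Asha", "scores": [91, 78], "active": True}

payload = pickle.dumps(record)      # pickling: object -> bytes
restored = pickle.loads(payload)    # unpickling: bytes -> object

print(type(payload).__name__)   # bytes
print(restored == record)       # True
print(restored is record)       # False
```

Only unpickle data from a source you trust, because loading a pickle can execute arbitrary code.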
83. How is memory managed in Python?
This is one of the most commonly asked python interview questions
Memory management in python comprises a private heap containing all objects and data structure. The heap is managed by the interpreter and the programmer does not have access to it at all. The Python memory manager does all the memory allocation. Moreover, there is an inbuilt garbage collector that recycles and frees memory for the heap space.
84. What is unittest in Python?
unittest is a unit testing framework in Python's standard library. It supports sharing of setup and shutdown code for tests, aggregation of tests into collections, test automation, and independence of the tests from the reporting framework.
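A small sketch of a test case follows. The add function and the AddTests class are hypothetical; the suite is built and run explicitly so that the snippet also works when pasted into an interactive session.

```python
import unittest


def add(a, b):
    return a + b


class AddTests(unittest.TestCase):
    def setUp(self):
        # Runs before every test method; shared setup code goes here.
        self.pairs = [(1, 2, 3), (-1, 1, 0)]

    def test_add(self):
        for a, b, expected in self.pairs:
            self.assertEqual(add(a, b), expected)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.wasSuccessful())
```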
85. How do you delete a file in Python?
Files can be deleted in Python by using os.remove(filename) or os.unlink(filename) from the os module.
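The sketch below creates a temporary file, deletes it, and shows what happens when the file is already gone.

```python
import os
import tempfile

# Create a throwaway file so there is something to delete.
fd, path = tempfile.mkstemp()
os.close(fd)

print(os.path.exists(path))   # True
os.remove(path)
print(os.path.exists(path))   # False

try:
    os.remove(path)           # removing a missing file raises an error
except FileNotFoundError:
    already_gone = True
```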
86. How do you create an empty class in Python?
To create an empty class we can use the pass command after the definition of the class object. A pass is a statement in Python that does nothing.
87. What are Python decorators?
Decorators are functions that take another function as an argument to modify its behavior without changing the function itself. These are useful when we want to dynamically increase the functionality of a function without changing it.
Here is an example:
def smart_divide(func):
    def inner(a, b):
        print("Dividing", a, "by", b)
        if b == 0:
            print("Make sure Denominator is not zero")
            return
        return func(a, b)
    return inner
@smart_divide
def divide(a, b):
    print(a/b)
divide(1,0)
Here smart_divide is a decorator function that is used to add functionality to simple divide function.
88. What is a dynamically typed language?
Type checking is an important part of any programming language; its purpose is to keep type errors to a minimum. The types of variables are checked either at compile time or at run time. When type checking is done at compile time, the language is called statically typed; when it is done at run time, the language is called dynamically typed. Python is dynamically typed.
- In a dynamically typed language, names are bound to typed objects by assignment at run time.
- Dynamically typed programming languages generally produce less optimized code than statically typed ones.
- In dynamically typed languages, the types of variables need not be declared before using them. Hence, a variable can be bound to a value of any type dynamically.
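The points above can be seen in a few lines. In this sketch the same name is bound first to an integer and then to a string, and the type error only appears when the offending line actually runs.

```python
value = 42
kinds = [type(value).__name__]

value = "forty-two"          # the same name can be rebound to another type
kinds.append(type(value).__name__)

try:
    result = value + 1       # the type error is detected only at run time
except TypeError as exc:
    result = None
    message = str(exc)

print(kinds)     # ['int', 'str']
print(result)    # None
```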
89. What is slicing in Python?
Slicing in Python refers to accessing parts of a sequence. The sequence can be any object that supports indexing, such as a string, list, or tuple. slice() is a built-in function that creates a slice object describing which segment of the sequence to take.
There are two variations of using the slice function. Syntax for slicing in Python:
- slice(start, stop)
- slice(start, stop, step)
Ex:
Str1 = ("g", "r", "e", "a", "t", "l", "e", "a", "r", "n", "i", "n", "g")
substr1 = slice(3, 5)
print(Str1[substr1])
# the same code can also be written in the following way
print(Str1[3:5])
substr2 = slice(0, 14, 2)
print(Str1[substr2])
# the same code can also be written in the following way
print(Str1[0:14:2])
90. What is the difference between Python Arrays and lists?
Python arrays and lists are both ordered, mutable collections of elements, but they differ in how they are used.
Arrays store homogeneous data, whether they are created with the built-in array module or with the numpy module. Lists can store heterogeneous data, and using lists does not require importing any module.
import array as a1
array1 = a1.array('i', [1 , 2 ,5] )
print (array1)
Or,
import numpy as a2
array2 = a2.array([5, 6, 9, 2])
print(array2)
- Arrays have to be created through an imported module (and, for the array module, with a type code) before use, whereas lists are built in.
- Numerical operations are easier to do on arrays as compared to lists.
91. What is Scope Resolution in Python?
A variable's accessibility in Python is determined by where it is assigned; this is called the scope of the variable. Scope resolution refers to the order in which Python searches these scopes to match a name to a variable. The scopes defined in Python are as follows.
a. Local scope – A variable assigned inside a function body is accessible only within that function. Loops and if blocks do not create a new scope.
b. Global scope – A variable assigned at the top level of a module is accessible throughout that module.
c. Enclosing scope – A variable assigned inside an outer (enclosing) function is accessible within that function and within any function nested inside it.
d. Built-in scope – Names such as len and print live in the builtins module and are accessible everywhere without an import.
The scope resolution for any variable in Python is made in a particular order (often called the LEGB rule), and that order is
Local scope -> enclosing scope -> global scope -> built-in scope
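The lookup order can be traced with one name defined at several levels. This is a small sketch with made-up function names, not code from a particular library.

```python
label = "global"

def outer():
    label = "enclosing"

    def inner_reads_enclosing():
        # No local 'label' here, so Python finds the enclosing one.
        return label

    def inner_with_local():
        label = "local"  # the local assignment wins
        return label

    return inner_reads_enclosing(), inner_with_local()

def reads_global():
    # No local or enclosing 'label', so the module-level one is used.
    return label

results = outer() + (reads_global(),)
print(results)
# 'len' is not defined anywhere above; it is found last, in the built-in scope.
print(len(results))
```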
92. What are Dict and List comprehensions?
List comprehensions provide a more compact and elegant way to create lists than for-loops, and a new list can also be created from an existing list.
The syntax used is as follows:
[expression for item in iterable]
Or,
[expression for item in iterable if condition]
Ex:
list1 = [a for a in range(5)]
print(list1)
list2 = [a for a in range(5) if a < 3]
print(list2)
Dictionary comprehensions provide a more compact and elegant way to create a dictionary, and a new dictionary can also be created from an existing dictionary.
The syntax used is:
{key_expression: value_expression for item in iterable}
Ex:
dict1 = {i: i*2 for i in range(5)}
print(dict1)
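To make the two forms concrete, here is a short sketch that also builds a dictionary from an existing one. The prices dictionary is a made-up example.

```python
# List comprehension, with and without a filter.
squares = [n * n for n in range(5)]
small = [n for n in range(5) if n < 3]

# Dictionary comprehension from a range.
doubled = {n: n * 2 for n in range(5)}

# Dictionary comprehension from an existing dictionary (hypothetical data).
prices = {"apple": 10, "orange": 20}
half_price = {name: value // 2 for name, value in prices.items()}

print(squares, small, doubled, half_price)
```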
93. What is the difference between xrange and range in Python?
range() and xrange() are built-in functions used to generate integers in a specified range. The difference between the two only matters in Python 2, because in Python 3 xrange() was removed and range() took over its lazy behaviour.
With respect to Python 2, the difference between range and xrange is as follows:
- range() takes more memory, because it builds the whole list up front
- xrange() is usually faster to create, because it produces values on demand
- range() returns a list of integers, whereas xrange() returns a lazy xrange object (often loosely described as a generator).
Example:
for i in range(1,10,2):
    print(i)
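In Python 3 the memory difference can still be observed by comparing a range object with the list built from it. This sketch assumes CPython, where sys.getsizeof reports the size of the container itself.

```python
import sys

lazy = range(100_000)          # stores only start, stop and step
eager = list(range(100_000))   # stores a reference to every integer

lazy_size = sys.getsizeof(lazy)
eager_size = sys.getsizeof(eager)

odd_numbers = list(range(1, 10, 2))
print(lazy_size, eager_size, odd_numbers)
```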
94. What is the difference between .py and .pyc files?
.py files are the source code files that the Python interpreter runs.
.pyc files hold the bytecode that Python compiles from the source. They are created when a module is imported (in a __pycache__ directory in Python 3), not only for built-in modules; a script that is run directly is normally not cached this way.
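Bytecode files can also be produced on request with the standard-library py_compile module. The sketch below writes a throwaway source file to a temporary directory; the file names are arbitrary.

```python
import os
import py_compile
import tempfile

# Write a tiny source file into a temporary directory.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "hello.py")
with open(source_path, "w") as handle:
    handle.write("print('hello')\n")

# Compile it to bytecode; compile() returns the path of the .pyc it wrote.
pyc_path = py_compile.compile(source_path, cfile=os.path.join(workdir, "hello.pyc"))
pyc_size = os.path.getsize(pyc_path)
print(pyc_path, pyc_size)
```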
Python Programming Interview Questions
Apart from having theoretical knowledge, having practical experience and knowing programming interview questions is a crucial part of the interview process. It helps the recruiters understand your hands-on experience. These are 45+ of the most commonly asked Python programming interview questions.
95. You have this covid-19 dataset below:
This is one of the most commonly asked python interview questions
From this dataset, how will you make a bar-plot for the top 5 states having maximum confirmed cases as of 17-07-2020?
Sol:
import seaborn as sns
import matplotlib.pyplot as plt
# df is assumed to be the covid-19 dataset, already loaded as a pandas DataFrame
#keeping only required columns
df = df[['Date', 'State/UnionTerritory', 'Cured', 'Deaths', 'Confirmed']]
#renaming column names
df.columns = ['date', 'state', 'cured', 'deaths', 'confirmed']
#current date
today = df[df.date == '2020-07-17']
#Sorting data w.r.t number of confirmed cases
max_confirmed_cases = today.sort_values(by="confirmed", ascending=False)
max_confirmed_cases
#Getting states with maximum number of confirmed cases
top_states_confirmed = max_confirmed_cases[0:5]
#Making bar-plot for states with top confirmed cases
sns.set(rc={'figure.figsize': (15, 10)})
sns.barplot(x="state", y="confirmed", data=top_states_confirmed, hue="state")
plt.show()
Code explanation:
We start off by taking only the required columns with this command:
df = df[['Date', 'State/UnionTerritory', 'Cured', 'Deaths', 'Confirmed']]
Then, we go ahead and rename the columns:
df.columns = ['date', 'state', 'cured', 'deaths', 'confirmed']
After that, we extract only those records where the date is equal to 17th July:
today = df[df.date == '2020-07-17']
Then, we sort by confirmed cases and select the top 5 states:
max_confirmed_cases = today.sort_values(by="confirmed", ascending=False)
max_confirmed_cases
top_states_confirmed = max_confirmed_cases[0:5]
Finally, we go ahead and make a bar-plot with this:
sns.set(rc={'figure.figsize': (15, 10)})
sns.barplot(x="state", y="confirmed", data=top_states_confirmed, hue="state")
plt.show()
Here, we are using the seaborn library to make the bar plot. The "state" column is mapped onto the x-axis and the "confirmed" column is mapped onto the y-axis. The color of the bars is determined by the "state" column.
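The filter, sort and take-the-top-N steps can be checked without pandas. The sketch below uses plain dictionaries as a stand-in for the DataFrame; the rows, state names and counts are made up purely for illustration.

```python
# Hypothetical rows standing in for the covid-19 DataFrame (numbers are invented).
rows = [
    {"date": "2020-07-17", "state": "A", "confirmed": 120},
    {"date": "2020-07-17", "state": "B", "confirmed": 300},
    {"date": "2020-07-16", "state": "B", "confirmed": 280},
    {"date": "2020-07-17", "state": "C", "confirmed": 50},
]

def top_states(rows, day, column, n):
    """Return the n state names with the largest value of `column` on `day`."""
    todays = [row for row in rows if row["date"] == day]          # filter by date
    ranked = sorted(todays, key=lambda row: row[column], reverse=True)  # sort descending
    return [row["state"] for row in ranked[:n]]                   # keep the top n

top_two = top_states(rows, "2020-07-17", "confirmed", 2)
print(top_two)
```

With pandas, the same three steps are the boolean filter, sort_values(ascending=False) and the [0:5] slice used in the solution.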
96. From this covid-19 dataset:
How can you make a bar plot for the top 5 states with the most deaths?
max_death_cases = today.sort_values(by="deaths", ascending=False)
max_death_cases
top_states_death = max_death_cases[0:5]
sns.set(rc={'figure.figsize': (15, 10)})
sns.barplot(x="state", y="deaths", data=top_states_death, hue="state")
plt.show()
Code Explanation:
We start off by sorting our dataframe in descending order w.r.t the "deaths" column and keeping the first 5 rows:
max_death_cases = today.sort_values(by="deaths", ascending=False)
max_death_cases
top_states_death = max_death_cases[0:5]
Then, we go ahead and make the bar-plot with the help of the seaborn library:
sns.set(rc={'figure.figsize': (15, 10)})
sns.barplot(x="state", y="deaths", data=top_states_death, hue="state")
plt.show()
Here, we are mapping the "state" column onto the x-axis and the "deaths" column onto the y-axis.
97. From this covid-19 dataset:
How can you make a line plot indicating the confirmed cases with respect to date?
Sol:
maha = df[df.state == 'Maharashtra']
sns.set(rc={'figure.figsize': (15, 10)})
sns.lineplot(x="date", y="confirmed", data=maha, color="g")
plt.show()
Code Explanation:
We start off by extracting all the records where the state is equal to "Maharashtra":
maha = df[df.state == 'Maharashtra']
Then, we go ahead and make a line-plot using the seaborn library:
sns.set(rc={'figure.figsize': (15, 10)})
sns.lineplot(x="date", y="confirmed", data=maha, color="g")
plt.show()
Here, we map the "date" column onto the x-axis and the "confirmed" column onto the y-axis.
98. On this “Maharashtra” dataset:
How will you implement a linear regression algorithm with “date” as the independent variable and “confirmed” as the dependent variable? That is you have to predict the number of confirmed cases w.r.t date.
import datetime as dt
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
# make sure the column holds datetimes, then convert each date to its ordinal (an integer day count)
maha['date'] = pd.to_datetime(maha['date']).map(dt.datetime.toordinal)
maha.head()
x = maha['date']
y = maha['confirmed']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(np.array(x_train).reshape(-1, 1), np.array(y_train).reshape(-1, 1))
# 737630 is the ordinal of 2020-07-24
lr.predict(np.array([[737630]]))
Code explanation:
We will start off by converting the date to ordinal type:
maha['date'] = pd.to_datetime(maha['date']).map(dt.datetime.toordinal)
This is done because linear regression needs numeric input and cannot be fitted directly on a date column.
Then, we go ahead and divide the dataset into train and test sets:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)
Finally, we go ahead and build the model and predict for a new date:
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(np.array(x_train).reshape(-1, 1), np.array(y_train).reshape(-1, 1))
lr.predict(np.array([[737630]]))
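The ordinal conversion is easy to inspect on its own with the standard library. This small sketch shows that an ordinal is simply a day count, so adding 7 moves one week ahead.

```python
import datetime as dt

day = dt.date(2020, 7, 17)
ordinal = day.toordinal()                 # days since 0001-01-01, counting that day as 1

round_trip = dt.date.fromordinal(ordinal)          # converting back recovers the date
one_week_later = dt.date.fromordinal(ordinal + 7)  # ordinals are evenly spaced integers

print(ordinal, round_trip, one_week_later)
```

Because consecutive days differ by exactly 1, the ordinal works as a numeric feature for a regression model.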
99. On this customer_churn dataset:
Build a Keras sequential model to find out how many customers will churn out on the basis of tenure of customer?
from keras.models import Sequential
from keras.layers import Dense
# x_train, x_test, y_train, y_test are assumed to come from a train/test split of the tenure column and the churn label
model = Sequential()
model.add(Dense(12, input_dim=1, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=150, validation_data=(x_test, y_test))
# predict_classes() is no longer available in recent Keras releases; threshold the sigmoid output instead
y_pred = (model.predict(x_test) > 0.5).astype('int32')
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_pred)
Code explanation:
We will start off by importing the required libraries:
from keras.models import Sequential
from keras.layers import Dense
Then, we go ahead and build the structure of the sequential model:
model = Sequential()
model.add(Dense(12, input_dim=1, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
Finally, we compile and train the model, predict the values, and build the confusion matrix:
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=150, validation_data=(x_test, y_test))
y_pred = (model.predict(x_test) > 0.5).astype('int32')
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_pred)
100. On this iris dataset:
Build a decision tree classification model, where the dependent variable is “Species” and the independent variable is “Sepal.Length”.
y = iris[['Species']]
x = iris[['Sepal.Length']]
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4)
from sklearn.tree import DecisionTreeClassifier
dtc = DecisionTreeClassifier()
dtc.fit(x_train, y_train)
y_pred = dtc.predict(x_test)
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_pred)
# accuracy = correct predictions (the diagonal) / all predictions, using the matrix from one particular run
(22+7+9)/(22+2+0+7+7+11+1+1+9)
Code explanation:
We start off by extracting the independent variable and dependent variable:
y = iris[['Species']]
x = iris[['Sepal.Length']]
Then, we go ahead and divide the data into train and test sets:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.4)
After that, we go ahead and build the model:
from sklearn.tree import DecisionTreeClassifier
dtc = DecisionTreeClassifier()
dtc.fit(x_train, y_train)
y_pred = dtc.predict(x_test)
Finally, we build the confusion matrix and compute the accuracy from it. The numbers below come from one run; because the split is random, they will differ from run to run:
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_pred)
(22+7+9)/(22+2+0+7+7+11+1+1+9)
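The accuracy arithmetic above can be written as a small helper. The sketch assumes the nine numbers in the expression are the rows of a 3x3 confusion matrix read left to right, which is consistent with 22, 7 and 9 sitting on the diagonal.

```python
# Confusion matrix reconstructed from the numbers in the expression above
# (row-major order is an assumption).
matrix = [
    [22, 2, 0],
    [7, 7, 11],
    [1, 1, 9],
]

def accuracy_from_confusion(matrix):
    """Accuracy = sum of the diagonal / sum of all cells."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

accuracy = accuracy_from_confusion(matrix)
print(round(accuracy, 4))
```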
101. On this iris dataset:
Build a decision tree regression model where the independent variable is “petal length” and dependent variable is “Sepal length”.
from sklearn.model_selection import train_test_split
x = iris[['Petal.Length']]
y = iris[['Sepal.Length']]
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25)
from sklearn.tree import DecisionTreeRegressor
dtr = DecisionTreeRegressor()
dtr.fit(x_train, y_train)
y_pred = dtr.predict(x_test)
y_pred[0:5]
from sklearn.metrics import mean_squared_error
mean_squared_error(y_test, y_pred)
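For reference, the mean squared error is the average of the squared differences between actual and predicted values. The helper below is a hand-written sketch of that formula, not scikit-learn's implementation, and the sample values are invented.

```python
def mean_squared_error_manual(actual, predicted):
    """Average of squared differences between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical sepal lengths and predictions.
mse = mean_squared_error_manual([5.0, 6.0, 7.0], [5.5, 6.0, 6.0])
print(mse)
```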
102. How will you scrape data from the website “cricbuzz”?
import sys
import time
from bs4 import BeautifulSoup
import requests
import pandas as pd
url = 'https://www.cricbuzz.com'
try:
    # fetch the page; this call may raise an exception if the request fails
    page = requests.get(url)
except Exception as e: # this describes what to do if an exception is thrown
    error_type, error_obj, error_info = sys.exc_info() # get the exception information
    print('ERROR FOR LINK:', url) # print the link that caused the problem
    print(error_type, 'Line:', error_info.tb_lineno) # print error info and the line that threw the exception
    # abandon this page, since there is nothing to parse
    sys.exit(1)
time.sleep(2)
soup = BeautifulSoup(page.text, 'html.parser')
# 'w_tle' is the class name used in the original solution; the site's markup may have changed since
links = soup.find_all('span', attrs={'class': 'w_tle'})
links
for i in links:
    print(i.text)
    print("\n")
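The parsing step can be tried offline on a fixed HTML string. The sketch below swaps BeautifulSoup for the standard-library html.parser.HTMLParser so it has no dependencies; the class name SpanTextCollector and the sample markup are hypothetical, and nested spans are not handled.

```python
from html.parser import HTMLParser

class SpanTextCollector(HTMLParser):
    """Collect the text of every <span> carrying a given CSS class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.inside = False
        self.texts = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        if tag == "span" and self.wanted_class in classes.split():
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.inside = False

    def handle_data(self, data):
        if self.inside and data.strip():
            self.texts.append(data.strip())

# Made-up markup that mimics a page with headline spans.
sample_html = (
    '<div><span class="w_tle">Match one</span>'
    '<span class="other">skip me</span>'
    '<span class="w_tle big">Match two</span></div>'
)
collector = SpanTextCollector("w_tle")
collector.feed(sample_html)
headlines = collector.texts
print(headlines)
```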
103. Write a user-defined function to implement the central-limit theorem. You have to implement the central limit theorem on this “insurance” dataset:
You also have to build two plots on “Sampling Distribution of BMI” and “Population distribution of BMI”.
import pandas as pd
df = pd.read_csv('insurance.csv')
# the question asks about BMI, so the bmi column is used here
series1 = df.bmi
series1.dtype
def central_limit_theorem(data, n_samples = 1000, sample_size = 500, min_value = 0, max_value = 1338):
    """ Use this function to demonstrate Central Limit Theorem.
    data = 1D array, or a pd.Series
    n_samples = number of samples to be created
    sample_size = size of the individual sample
    min_value = minimum index of the data
    max_value = maximum index value of the data """
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    b = {}
    for i in range(n_samples):
        x = np.unique(np.random.randint(min_value, max_value, size = sample_size)) # set of random indices (duplicates are dropped)
        b[i] = data[x].mean() # Mean of each sample
    c = pd.DataFrame()
    c['sample'] = list(b.keys()) # Sample number
    c['Mean'] = list(b.values()) # mean of that particular sample
    plt.figure(figsize= (15,5))
    plt.subplot(1,2,1)
    sns.histplot(c.Mean, kde=True) # histplot replaces the deprecated distplot
    plt.title(f"Sampling Distribution of bmi. \n \u03bc = {round(c.Mean.mean(), 3)} & SE = {round(c.Mean.std(),3)}")
    plt.xlabel('data')
    plt.ylabel('freq')
    plt.subplot(1,2,2)
    sns.histplot(data, kde=True)
    plt.title(f"population Distribution of bmi. \n \u03bc = {round(data.mean(), 3)} & \u03C3 = {round(data.std(),3)}")
    plt.xlabel('data')
    plt.ylabel('freq')
    plt.show()
central_limit_theorem(series1, n_samples = 5000, sample_size = 500)
Code Explanation:
We start off by importing the insurance.csv file with this command:
df = pd.read_csv('insurance.csv')
Then we go ahead and define the central limit theorem function:
def central_limit_theorem(data, n_samples = 1000, sample_size = 500, min_value = 0, max_value = 1338):
This function takes these parameters:
- data
- n_samples
- sample_size
- min_value
- max_value
Inside this function, we import all the required libraries:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Next, we draw n_samples random samples from the data and store the mean of each sample. Then, we go ahead and create the first sub-plot for "Sampling distribution of bmi":
plt.subplot(1,2,1)
sns.histplot(c.Mean, kde=True)
plt.title(f"Sampling Distribution of bmi. \n \u03bc = {round(c.Mean.mean(), 3)} & SE = {round(c.Mean.std(),3)}")
plt.xlabel('data')
plt.ylabel('freq')
Finally, we create the sub-plot for "Population distribution of BMI":
plt.subplot(1,2,2)
sns.histplot(data, kde=True)
plt.title(f"population Distribution of bmi. \n \u03bc = {round(data.mean(), 3)} & \u03C3 = {round(data.std(),3)}")
plt.xlabel('data')
plt.ylabel('freq')
plt.show()
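The numeric side of the theorem can be checked without plots or the insurance file. The sketch below uses a synthetic, deliberately skewed population (exponential draws, an assumption made only for illustration) and compares the spread of the sample means with the predicted standard error, sigma divided by the square root of the sample size.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs give the same numbers

# Synthetic skewed population (not the insurance data).
population = [random.expovariate(1.0) for _ in range(10_000)]

sample_size = 100
n_samples = 2_000
# Draw samples with replacement and record each sample's mean.
sample_means = [
    statistics.fmean(random.choices(population, k=sample_size))
    for _ in range(n_samples)
]

pop_mean = statistics.fmean(population)
pop_std = statistics.pstdev(population)
mean_of_means = statistics.fmean(sample_means)
observed_se = statistics.stdev(sample_means)
expected_se = pop_std / sample_size ** 0.5  # CLT prediction for the standard error

print(round(pop_mean, 3), round(mean_of_means, 3))
print(round(observed_se, 4), round(expected_se, 4))
```

The mean of the sample means should sit close to the population mean, and the observed spread should be close to the predicted standard error even though the population itself is far from normal.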
104. Write code to perform sentiment analysis on amazon reviews:
import numpy as np
import bz2
import re
from sklearn.metrics import accuracy_score
def get_labels_and_texts(file):
    labels = []
    texts = []
    for line in bz2.BZ2File(file):
        x = line.decode("utf-8")
        # each line starts with "__label__1" or "__label__2"; the digit is at index 9
        labels.append(int(x[9]) - 1)
        texts.append(x[10:].strip())
    return np.array(labels), texts
train_labels, train_texts = get_labels_and_texts('train.ft.txt.bz2')
test_labels, test_texts = get_labels_and_texts('test.ft.txt.bz2')
train_labels[0]
train_texts[0]
# keep a small subset so the example trains quickly
train_labels = train_labels[0:500]
train_texts = train_texts[0:500]
NON_ALPHANUM = re.compile(r'[\W]')
NON_ASCII = re.compile(r'[^a-z0-9\s]')
def normalize_texts(texts):
    normalized_texts = []
    for text in texts:
        lower = text.lower()
        no_punctuation = NON_ALPHANUM.sub(r' ', lower)
        no_non_ascii = NON_ASCII.sub(r'', no_punctuation)
        normalized_texts.append(no_non_ascii)
    return normalized_texts
train_texts = normalize_texts(train_texts)
test_texts = normalize_texts(test_texts)
from sklearn.feature_extraction.text import CountVectorizer
cv = CountVectorizer(binary=True)
cv.fit(train_texts)
X = cv.transform(train_texts)
X_test = cv.transform(test_texts)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(
    X, train_labels, train_size = 0.75)
for c in [0.01, 0.05, 0.25, 0.5, 1]:
    lr = LogisticRegression(C=c)
    lr.fit(X_train, y_train)
    print ("Accuracy for C=%s: %s"
           % (c, accuracy_score(y_val, lr.predict(X_val))))
lr.predict(X_test[29])
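The label parsing and text normalization steps can be exercised on a single line without the dataset files. In this sketch the review line is invented; it follows the "__label__N text" layout that the solution's indexing assumes.

```python
import re

NON_ALPHANUM = re.compile(r"[\W]")
NON_ASCII = re.compile(r"[^a-z0-9\s]")

def parse_line(line):
    """Split a '__label__N text' line into a 0/1 label and the raw text."""
    label = int(line[9]) - 1        # '__label__' is 9 characters long
    text = line[10:].strip()
    return label, text

def normalize(text):
    lower = text.lower()
    no_punctuation = NON_ALPHANUM.sub(" ", lower)   # punctuation becomes spaces
    return NON_ASCII.sub("", no_punctuation)        # drop anything outside a-z, 0-9, whitespace

# Hypothetical review line.
label, text = parse_line("__label__2 Great product!!! 10/10, würde kaufen\n")
tokens = normalize(text).split()
print(label, tokens)
```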
105. Implement a probability plot using numpy and matplotlib:
Sol:
import numpy as np
import pylab
import scipy.stats as stats
from matplotlib import pyplot as plt
n1=np.random.normal(loc=0,scale=1,size=1000)
np.percentile(n1,100)
n1=np.random.normal(loc=20,scale=3,size=100)
stats.probplot(n1,dist="norm",plot=pylab)
plt.show()
106. Implement multiple linear regression on this iris dataset:
The independent variables should be “Sepal.Width”, “Petal.Length”, “Petal.Width”, while the dependent variable should be “Sepal.Length”.
Sol:
import pandas as pd
iris = pd.read_csv("iris.csv")
iris.head()
x = iris[['Sepal.Width', 'Petal.Length', 'Petal.Width']]
y = iris[['Sepal.Length']]
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.35)
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(x_train, y_train)
y_pred = lr.predict(x_test)
from sklearn.metrics import mean_squared_error
mean_squared_error(y_test, y_pred)
Code explanation:
We start off by importing the required libraries and loading the dataset:
import pandas as pd
iris = pd.read_csv("iris.csv")
iris.head()
Then, we will go ahead and extract the independent variables and dependent variable:
x = iris[['Sepal.Width', 'Petal.Length', 'Petal.Width']]
y = iris[['Sepal.Length']]
Following which, we divide the data into train and test sets:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.35)
Then, we go ahead and build the model:
from sklearn.linear_model import LinearRegression
lr = LinearRegression()
lr.fit(x_train, y_train)
y_pred = lr.predict(x_test)
Finally, we will find out the mean squared error:
from sklearn.metrics import mean_squared_error
mean_squared_error(y_test, y_pred)
107. From this credit fraud dataset:
Find the percentage of transactions that are fraudulent and not fraudulent. Also build a logistic regression model, to find out if the transaction is fraudulent or not.
Sol:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
# data_df is assumed to be the credit fraud dataset loaded as a DataFrame, with a 0/1 'Class' column
nfcount = 0
notFraud = data_df['Class']
for i in range(len(notFraud)):
    if notFraud[i] == 0:
        nfcount = nfcount + 1
nfcount
per_nf = (nfcount / len(notFraud)) * 100
print('percentage of total not fraud transaction in the dataset: ', per_nf)
fcount = 0
Fraud = data_df['Class']
for i in range(len(Fraud)):
    if Fraud[i] == 1:
        fcount = fcount + 1
fcount
per_f = (fcount / len(Fraud)) * 100
print('percentage of total fraud transaction in the dataset: ', per_f)
x = data_df.drop(['Class'], axis = 1) # drop the target variable
y = data_df['Class']
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size = 0.2, random_state = 42)
logisticreg = LogisticRegression()
logisticreg.fit(xtrain, ytrain)
y_pred = logisticreg.predict(xtest)
accuracy = logisticreg.score(xtest, ytest)
cm = metrics.confusion_matrix(ytest, y_pred)
print(cm)
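The two counting loops can be condensed with collections.Counter. The sketch below runs on a short made-up list of 0/1 labels rather than the real dataset.

```python
from collections import Counter

# Hypothetical class labels: 0 = not fraud, 1 = fraud.
labels = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]

counts = Counter(labels)
percentages = {cls: 100 * n / len(labels) for cls, n in counts.items()}

print("not fraud: %.1f%%, fraud: %.1f%%" % (percentages[0], percentages[1]))
```

With a pandas Series, data_df['Class'].value_counts(normalize=True) gives the same proportions in one call.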
108. Implement a simple CNN on the MNIST dataset using Keras. Following this, also add in drop-out layers.
Sol:
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import Dense, Activation, Dropout, Flatten, Reshape
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras import utils
from matplotlib import pyplot as plt
plt.rcParams['figure.figsize'] = (15, 8)
# Load/Prep the Data
(x_train, y_train_num), (x_test, y_test_num) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1).astype('float32')
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1).astype('float32')
x_train /= 255
x_test /= 255
y_train = utils.to_categorical(y_train_num, 10)
y_test = utils.to_categorical(y_test_num, 10)
print('--- THE DATA ---')
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
TRAIN = False
BATCH_SIZE = 32
EPOCHS = 1
# Define the Type of Model (baseline: fully connected layers only, no convolution yet)
model1 = tf.keras.Sequential()
# Flatten Images to Vector
model1.add(Reshape((784,), input_shape=(28, 28, 1)))
# Layer 1
model1.add(Dense(128, kernel_initializer='he_normal', use_bias=True))
model1.add(Activation("relu"))
# Layer 2
model1.add(Dense(10, kernel_initializer='he_normal', use_bias=True))
model1.add(Activation("softmax"))
# Loss and Optimizer
model1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Store Training Results
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=10, verbose=1, mode='auto')
callback_list = [early_stopping]
# Train the model
model1.fit(x_train, y_train, epochs=EPOCHS, batch_size=BATCH_SIZE, validation_data=(x_test, y_test), callbacks=callback_list, verbose=True)
# CNN with drop-out layers:
# Define Model
model3 = tf.keras.Sequential()
# 1st Conv Layer
model3.add(Conv2D(32, (3, 3), input_shape=(28, 28, 1)))
model3.add(Activation('relu'))
# 2nd Conv Layer
model3.add(Conv2D(32, (3, 3)))
model3.add(Activation('relu'))
# Max Pooling
model3.add(MaxPooling2D(pool_size=(2,2)))
# Dropout
model3.add(Dropout(0.25))
# Fully Connected Layer
model3.add(Flatten())
model3.add(Dense(128))
model3.add(Activation('relu'))
# More Dropout
model3.add(Dropout(0.5))
# Prediction Layer
model3.add(Dense(10))
model3.add(Activation('softmax'))
# Loss and Optimizer
model3.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Store Training Results
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=7, verbose=1, mode='auto')
callback_list = [early_stopping]
# Train the model
model3.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=EPOCHS,
           validation_data=(x_test, y_test), callbacks=callback_list)
109. Implement a popularity-based recommendation system on this movie lens dataset:
import os
import numpy as np
import pandas as pd
ratings_data = pd.read_csv("ratings.csv")
ratings_data.head()
movie_names = pd.read_csv("movies.csv")
movie_names.head()
movie_data = pd.merge(ratings_data, movie_names, on='movieId')
movie_data.groupby('title')['rating'].mean().head()
movie_data.groupby('title')['rating'].mean().sort_values(ascending=False).head()
movie_data.groupby('title')['rating'].count().sort_values(ascending=False).head()
ratings_mean_count = pd.DataFrame(movie_data.groupby('title')['rating'].mean())
ratings_mean_count.head()
ratings_mean_count['rating_counts'] = movie_data.groupby('title')['rating'].count()
ratings_mean_count.head()
# the most popular titles are taken here to be the ones with the most ratings
ratings_mean_count.sort_values('rating_counts', ascending=False).head(10)
110. Implement the naive Bayes algorithm on top of the diabetes dataset:
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt # matplotlib.pyplot plots data
import seaborn as sns
pdata = pd.read_csv("pima-indians-diabetes.csv")
columns = list(pdata)[0:-1] # Excluding the Outcome column
pdata[columns].hist(stacked=False, bins=100, figsize=(12,30), layout=(14,2));
# Histogram of first 8 columns
To see the correlations graphically, we can use the function below:
def plot_corr(df, size=11):
    corr = df.corr()
    fig, ax = plt.subplots(figsize=(size, size))
    ax.matshow(corr)
    plt.xticks(range(len(corr.columns)), corr.columns)
    plt.yticks(range(len(corr.columns)), corr.columns)
plot_corr(pdata)
from sklearn.model_selection import train_test_split
X = pdata.drop('class', axis=1) # Predictor feature columns (8 X m)
Y = pdata['class'] # Predicted class (1=True, 0=False) (1 X m)
x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=1)
# 1 is just any random seed number
x_train.head()
from sklearn.naive_bayes import GaussianNB # using Gaussian algorithm from Naive Bayes
# create the model
diab_model = GaussianNB()
diab_model.fit(x_train, y_train)
diab_train_predict = diab_model.predict(x_train)
from sklearn import metrics
print("Training Accuracy: {0:.4f}".format(metrics.accuracy_score(y_train, diab_train_predict)))
print()
diab_test_predict = diab_model.predict(x_test)
print("Test Accuracy: {0:.4f}".format(metrics.accuracy_score(y_test, diab_test_predict)))
print()
print("Confusion Matrix")
cm = metrics.confusion_matrix(y_test, diab_test_predict, labels=[1, 0])
df_cm = pd.DataFrame(cm, index = ["1", "0"],
                     columns = ["Predict 1", "Predict 0"])
plt.figure(figsize = (7,5))
sns.heatmap(df_cm, annot=True)
111. How can you find the minimum and maximum values present in a tuple?
Solution ->
We can use the min() function on top of the tuple to find out the minimum value present in the tuple:
tup1=(1,2,3,4,5)
min(tup1)
Output
1
We see that the minimum value present in the tuple is 1.
Analogous to the min() function is the max() function, which will help us to find out the maximum value present in the tuple:
tup1=(1,2,3,4,5)
max(tup1)
Output
5
We see that the maximum value present in the tuple is 5.
112. If you have a list like this -> [1,"a",2,"b",3,"c"]. How can you access the 2nd, 4th and 5th elements from this list?
Solution ->
We will start off by creating a tuple that will comprise the indices of elements that we want to access.
Then, we will use a for loop to go through the index values and print out the element at each index.
Below is the entire code for the process:
a = [1,"a",2,"b",3,"c"]
indices = (1,3,4)
for i in indices:
    print(a[i])
113. If you have a list like this -> ["sparta",True,3+4j,False]. How would you reverse the elements of this list?
Solution ->
We can use the reverse() method on the list, which reverses it in place:
a = ["sparta",True,3+4j,False]
a.reverse()
a
114. If you have a dictionary like this -> fruit={"Apple":10,"Orange":20,"Banana":30,"Guava":40}. How would you update the value of 'Apple' from 10 to 100?
Solution ->
This is how you can do it:
fruit={"Apple":10,"Orange":20,"Banana":30,"Guava":40}
fruit["Apple"]=100
fruit
Give the name of the key inside the square brackets and assign it a new value.
115. If you have two sets like this -> s1 = {1,2,3,4,5,6}, s2 = {5,6,7,8,9}. How would you find the common elements in these sets.
Solution ->
You can use the intersection() function to find the common elements between the two sets:
s1 = {1,2,3,4,5,6}
s2 = {5,6,7,8,9}
s1.intersection(s2)
We see that the common elements between the two sets are 5 & 6.
116. Write a program to print out the 2-table using while loop.
Solution ->
Below is the code to print out the 2-table:
Code
i=1
n=2
while i<=10:
    print(n,"*", i, "=", n*i)
    i=i+1
We start off by initializing two variables ‘i’ and ‘n’. ‘i’ is initialized to 1 and ‘n’ is initialized to ‘2’.
Inside the while loop, since the ‘i’ value goes from 1 to 10, the loop iterates 10 times.
Initially n*i is equal to 2*1, and we print out the value.
Then, ‘i’ value is incremented and n*i becomes 2*2. We go ahead and print it out.
This process goes on until i value becomes 10.
117. Write a function, which will take in a value and print out if it is even or odd.
Solution ->
The below code will do the job:
def even_odd(x):
    if x%2==0:
        print(x, "is even")
    else:
        print(x, "is odd")
Here, we start off by creating a function with the name 'even_odd()'. This function takes a single parameter and prints out whether the number is even or odd.
Now, let's invoke the function:
even_odd(5)
We see that, when 5 is passed as a parameter into the function, we get the output -> '5 is odd'.
118. Write a python program to print the factorial of a number.
Solution ->
Below is the code to print the factorial of a number:
num = int(input("Enter a number: "))
factorial = 1
#check if the number is negative, positive or zero
if num<0:
    print("Sorry, factorial does not exist for negative numbers")
elif num==0:
    print("The factorial of 0 is 1")
else:
    for i in range(1,num+1):
        factorial = factorial*i
    print("The factorial of",num,"is",factorial)
We start off by taking an input which is stored in 'num'. Then, we check if 'num' is less than zero, and if it is, we print out 'Sorry, factorial does not exist for negative numbers'.
After that, we check if 'num' is equal to zero, and if that's the case, we print out 'The factorial of 0 is 1'.
On the other hand, if 'num' is greater than 0, we enter the for loop and calculate the factorial of the number.
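For reuse and testing, the same logic can be wrapped in a function that returns the value instead of printing it. This is a sketch; raising ValueError for negative input is a design choice made here, not part of the original solution.

```python
import math

def factorial(num):
    """Return num! for a non-negative integer, computed with a loop."""
    if num < 0:
        raise ValueError("factorial does not exist for negative numbers")
    result = 1
    for i in range(1, num + 1):
        result *= i
    return result

print(factorial(0), factorial(5))
# The standard library offers the same computation as math.factorial.
print(math.factorial(5))
```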
119. Write a python program to check if the number given is a palindrome or not
Solution ->
Below is the code to check whether the given number is a palindrome or not:
n=int(input("Enter number:"))
temp=n
rev=0
while(n>0):
    dig=n%10
    rev=rev*10+dig
    n=n//10
if(temp==rev):
    print("The number is a palindrome!")
else:
    print("The number isn't a palindrome!")
We will start off by taking an input and store it in 'n' and make a duplicate of it in 'temp'. We will also initialize another variable 'rev' to 0.
Then, we will enter a while loop which will go on until 'n' becomes 0.
Inside the loop, we will start off by dividing 'n' by 10 and storing the remainder (the last digit) in 'dig'.
Then, we will multiply 'rev' by 10 and then add 'dig' to it. This result will be stored back in 'rev'.
Going ahead, we will integer-divide 'n' by 10 and store the result back in 'n', which drops the last digit.
Once the while loop ends, we will compare the values of 'rev' and 'temp'. If they are equal, we will print 'The number is a palindrome', else we will print 'The number isn't a palindrome'.
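The digit-reversal loop can be packaged as a function and compared against a string-based check. Both function names below are illustrative.

```python
def is_palindrome_number(n):
    """Reverse the digits arithmetically and compare with the original."""
    original = n
    rev = 0
    while n > 0:
        rev = rev * 10 + n % 10   # append the last digit of n to rev
        n //= 10                  # drop the last digit of n
    return original == rev

def is_palindrome_text(n):
    """Alternative: compare the decimal string with its reverse."""
    digits = str(n)
    return digits == digits[::-1]

print(is_palindrome_number(121), is_palindrome_number(123))
```

The string version is shorter; the arithmetic version avoids creating intermediate strings and is the one interviewers usually ask for.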
120. Write a python program to print the following pattern ->
1
2 2
3 3 3
4 4 4 4
5 5 5 5 5
Solution ->
Below is the code to print this pattern:
#5 is the total number of rows to print
for num in range(1, 6):
    for i in range(num):
        print(num,end=" ")#print number
    #new line after each row to display pattern correctly
    print()
We are solving the problem with the help of a nested for loop. We will have an outer for loop, which goes from 1 to 5. Then, we have an inner for loop, which prints the current number as many times as its value.
121. Pattern questions. Print the following pattern
#
# #
# # #
# # # #
# # # # #
Solution –>
def pattern_1(num):
    # outer loop handles the number of rows
    # inner loop handles the number of columns
    # num is the number of rows.
    for i in range(0, num):
        # value of j depends on i
        for j in range(0, i+1):
            # printing hashes
            print("#",end=" ")
        # ending line after each row
        print()
num = int(input("Enter the number of rows in pattern: "))
pattern_1(num)
122. Print the following pattern.
        #
      # #
    # # #
  # # # #
# # # # #
Solution –>
Code:
def pattern_2(num):
    # define the number of spaces
    k = 2*num - 2
    # outer loop always handles the number of rows
    # let us use the inner loop to control the number of spaces
    # we need the number of spaces as maximum initially and then decrement it after every iteration
    for i in range(0, num):
        for j in range(0, k):
            print(end=" ")
        # decrementing k after each loop
        k = k - 2
        # reinitializing the inner loop to keep a track of the number of columns
        # similar to pattern_1 function
        for j in range(0, i+1):
            print("# ", end="")
        # ending line after each row
        print()
num = int(input("Enter the number of rows in pattern: "))
pattern_2(num)
123. Print the following pattern:
0
0 1
0 1 2
0 1 2 3
0 1 2 3 4
Solution –>
Code:
def pattern_3(num):
    # outer loop always handles the number of rows
    # let us use the inner loop to control the number
    for i in range(0, num):
        # re assigning number after every iteration
        # ensure the column starts from 0
        number = 0
        # inner loop to handle number of columns
        for j in range(0, i+1):
            # printing number
            print(number, end=" ")
            # increment number column wise
            number = number + 1
        # ending line after each row
        print()
num = int(input("Enter the number of rows in pattern: "))
pattern_3(num)
124. Print the following pattern:
1
2 3
4 5 6
7 8 9 10
11 12 13 14 15
Solution –>
Code:
def pattern_4(num):
    # initialising starting number
    number = 1
    # outer loop always handles the number of rows
    # let us use the inner loop to control the number
    for i in range(0, num):
        # unlike pattern_3, number is NOT reset here,
        # which ensures that the numbers are printed continuously
        # inner loop to handle number of columns
        for j in range(0, i+1):
            # printing number
            print(number, end=" ")
            # increment number column wise
            number = number + 1
        # ending line after each row
        print()
num = int(input("Enter the number of rows in pattern: "))
pattern_4(num)
125. Print the following pattern:
A
B B
C C C
D D D D
Solution –>
def pattern_5(num):
    # initializing value of A as 65
    # ASCII value equivalent
    number = 65
    # outer loop always handles the number of rows
    for i in range(0, num):
        # inner loop handles the number of columns
        for j in range(0, i+1):
            # finding the ascii equivalent of the number
            char = chr(number)
            # printing char value
            print(char, end=" ")
        # incrementing number once per row, so each row repeats one letter
        number = number + 1
        # ending line after each row
        print("\r")
num = int(input("Enter the number of rows in pattern: "))
pattern_5(num)
126. Print the following pattern:
A
B C
D E F
G H I J
K L M N O
P Q R S T U
Solution –>
def pattern_6(num):
    # initializing value equivalent to 'A' in ASCII
    # ASCII value
    number = 65
    # outer loop always handles the number of rows
    for i in range(0, num):
        # inner loop to handle number of columns
        # values changing acc. to outer loop
        for j in range(0, i+1):
            # explicit conversion of int to char
            # returns character equivalent to ASCII.
            char = chr(number)
            # printing char value
            print(char, end=" ")
            # printing the next character by incrementing
            number = number + 1
        # ending line after each row
        print("\r")
num = int(input("enter the number of rows in the pattern: "))
pattern_6(num)
127. Print the following pattern
    #
   # #
  # # #
 # # # #
# # # # #
Solution –>
Code:
def pattern_7(num):
    # number of spaces is a function of the input num
    k = 2*num - 2
    # outer loop always handle the number of rows
    for i in range(0, num):
        # inner loop used to handle the number of spaces
        for j in range(0, k):
            print(end=" ")
        # the variable holding information about number of spaces
        # is decremented after every iteration
        k = k - 1
        # inner loop reinitialized to handle the number of columns
        for j in range(0, i+1):
            # printing hash
            print("# ", end="")
        # ending line after each row
        print("\r")
num = int(input("Enter the number of rows: "))
pattern_7(num)
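The nested loops above can also be expressed by building each row as a string, which makes the result easy to check. The sketch below is one possible variant, not the article's solution: the function name pyramid_rows is hypothetical, and it produces the same centred shape without the extra left margin.

```python
def pyramid_rows(num):
    """Return the rows of a centred pyramid of '#' as a list of strings."""
    rows = []
    for i in range(1, num + 1):
        # "###" becomes "# # #": one space between neighbouring hashes
        hashes = " ".join("#" * i)
        # each row is shifted left by one space compared with the row above
        rows.append(" " * (num - i) + hashes)
    return rows


print("\n".join(pyramid_rows(5)))
```

Returning the rows instead of printing them directly keeps the pattern logic separate from input and output.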
128. If you have a dictionary like this -> d1={"k1":10,"k2":20,"k3":30}. How would you increment the values of all the keys?
d1={"k1":10,"k2":20,"k3":30}
for i in d1.keys():
    d1[i]=d1[i]+1
129. How can you get a random number in python?
Ans. To generate a random number, we use Python's random module. Here are some examples.
To generate a floating-point number in the range 0 to 1 (1 itself is excluded):
import random
n = random.random()
print(n)
To generate an integer within a certain range (say from a to b, both included):
import random
a, b = 1, 10
n = random.randint(a,b)
print(n)
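When results need to be repeatable, for example in tests, a seeded generator can help. The following is a minimal sketch; the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import random

# A dedicated generator with a fixed seed produces the same sequence on every run.
rng = random.Random(42)

x = rng.random()                                # float in the range [0.0, 1.0)
n = rng.randint(1, 6)                           # integer from 1 to 6, both ends included
colour = rng.choice(["red", "green", "blue"])   # one element picked at random
print(x, n, colour)
```

The random module is not suitable for security purposes; Python's secrets module is intended for that use.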
130. Explain how you can set up the Database in Django.
All of the project’s settings, as well as database connection information, are contained in the settings.py file. Django works with the SQLite database by default, but it may be configured to operate with other databases as well.
Database connectivity requires full connection information, including the database name, user credentials, hostname, port, and driver (engine) name.
To connect to MySQL and establish a connection between the application and the database, use the django.db.backends.mysql driver.
All connection information must be included in the settings file. Our project’s settings.py file has the following code for the database.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'djangoApp',
        'USER':'root',
        'PASSWORD':'mysql',
        'HOST':'localhost',
        'PORT':'3306'
    }
}
After saving these settings, run the migrate command (python manage.py migrate). This command will build tables for admin, auth, contenttypes, and sessions. You may now connect to the MySQL database by selecting it from the database drop-down menu.
131. Give an example of how you can write a VIEW in Django?
The Django MVT structure is incomplete without Django views. According to the Django documentation, a view function is a Python function that receives a web request and returns a web response. This response might be a web page's HTML content, a redirect, a 404 error, an XML document, an image, or anything else that a web browser can display.
Django views are part of the user interface: they usually render the HTML/CSS/JavaScript in your template files into what you see in your browser when a web page is displayed. (If you have used other frameworks based on MVC (Model-View-Controller), do not confuse Django views with MVC views. A Django view plays a role closer to an MVC controller, while Django templates are closer to MVC views.)
# import HttpResponse from django
from django.http import HttpResponse
# get datetime
import datetime
# create a function
def geeks_view(request):
    # fetch date and time
    now = datetime.datetime.now()
    # convert to string
    html = "Time is {}".format(now)
    # return response
    return HttpResponse(html)
132. Explain the use of sessions in the Django framework?
Django (and much of the Internet) uses sessions to keep track of the "state" between a site and a particular browser. Sessions allow you to store arbitrary data per browser and make it available to the site each time the browser connects. Individual data items in the session are referenced by a "key", which is used both to store and to retrieve the data.
Django uses a cookie containing a unique session ID to identify each browser and its associated session with the site. Session data is stored in the site's database by default (this is safer than storing the data in a cookie, where it is more vulnerable to attackers).
Django allows you to store session data in a variety of locations (cache, files, “safe” cookies), but the default location is a solid and secure choice.
Enabling sessions
When we built the skeleton website, sessions were enabled by default.
The config is set up in the project file (locallibrary/locallibrary/settings.py) under the INSTALLED_APPS and MIDDLEWARE sections, as shown below:
INSTALLED_APPS = [
    ...
    'django.contrib.sessions',
    ...
]
MIDDLEWARE = [
    ...
    'django.contrib.sessions.middleware.SessionMiddleware',
    ...
]
Using sessions
You can access the session attribute in a view through the request parameter (an HttpRequest passed in as the first argument to the view). This session attribute represents the connection to the current user (or, to be more accurate, the connection to the current browser, as identified by the session ID in the browser's cookie for this site).
The session attribute is a dictionary-like object that you can read and write as often as you need in your view, updating it as you go. You can perform all of the standard dictionary operations, such as clearing all data, testing for the presence of a key, looping over data, and so on. Most of the time, though, you'll simply get and set values using the usual "dictionary" API.
The code fragments below show how to get, set, and delete data stored under the key "my_bike" in the current session (browser).
Note: One of the best things about Django is that you don't have to think about the mechanisms that tie the session to the current request in your view. If we were to use the fragments below in our view, we'd know that the information about my_bike is associated only with the browser that sent the current request.
# Get a session value via its key (for example 'my_bike'), raising a KeyError if the key is not present
my_bike = request.session['my_bike']
# Get a session value, setting a default value if it is not present ('mini')
my_bike = request.session.get('my_bike', 'mini')
# Set a session value
request.session['my_bike'] = 'mini'
# Delete a session value
del request.session['my_bike']
The API also offers a number of other methods, most of which are used to manage the associated session cookie. For example, there are methods to test whether the client browser supports cookies, to set and check cookie expiry dates, and to clear expired sessions from the data store. The full API is described in "How to use sessions" (Django docs).
133. List out the inheritance styles in Django.
Abstract base classes: This inheritance pattern is used by developers when they want the parent class to keep data that they don’t want to type out for each child model.
models.py
from django.db import models
# Create your models here.
class ContactInfo(models.Model):
    name=models.CharField(max_length=20)
    email=models.EmailField(max_length=20)
    address=models.TextField(max_length=20)
    class Meta:
        abstract=True
class Customer(ContactInfo):
    # phone numbers are stored as text; IntegerField ignores max_length
    phone=models.CharField(max_length=15)
class Staff(ContactInfo):
    position=models.CharField(max_length=10)
admin.py
from django.contrib import admin
from .models import Customer, Staff
admin.site.register(Customer)
admin.site.register(Staff)
Two tables are created in the database when we migrate these changes. The Customer table has fields for name, email, address, and phone. The Staff table has fields for name, email, address, and position. No table is created for the base class in this inheritance style.
Multi-table inheritance: It is utilised when you wish to subclass an existing model and have each of the subclasses have its own database table.
models.py
from django.db import models
# Create your models here.
class Place(models.Model):
    name=models.CharField(max_length=20)
    address=models.TextField(max_length=20)
    def __str__(self):
        return self.name
class Restaurants(Place):
    serves_pizza=models.BooleanField(default=False)
    serves_pasta=models.BooleanField(default=False)
    def __str__(self):
        # __str__ must return a string, so use the inherited name rather than a boolean field
        return self.name
admin.py
from django.contrib import admin
from .models import Place,Restaurants
# Register your models here.
admin.site.register(Place)
admin.site.register(Restaurants)
Proxy models: This inheritance style lets you change a model's Python-level behaviour without changing its fields.
You inherit from the base class and can add your own methods and properties, but not fields, and no new database table is created.
- The base class must not be an abstract class.
- A proxy model cannot inherit from more than one non-abstract model.
The main purpose of a proxy model is to override key behaviour of an existing model. Queries on the proxy always run against the original model's table, with the overridden methods applied.
134. How can you get the Google cache age of any URL or web page?
Use the URL
https://webcache.googleusercontent.com/search?q=cache:<your url without "http://">
Example:
It contains a header like this:
This is Google's cache of https://stackoverflow.com/. It is a snapshot of the page as it appeared at 11:33:38 GMT on August 21, 2012. The current page could have changed in the meantime.
Tip: Use the find bar and press Ctrl+F or ⌘+F (Mac) to quickly find your search word on this page.
You'll have to scrape the resulting page to read the date. The most recent cached page can be found at this URL:
http://webcache.googleusercontent.com/search?q=cache:www.something.com/path
The first div in the body tag contains Google information.
You can also use the CachedPages website.
Large enterprises with sophisticated web servers typically preserve and keep cached pages. Because such servers are often quite fast, a cached page can frequently be retrieved faster than the live website:
- A current copy of the page is generally kept by Google (1 to 15 days old).
- Coral also retains a current copy, although it isn’t as up to date as Google’s.
- You may access several versions of a web page preserved over time using Archive.org.
So, the next time you can’t access a website but still want to look at it, Google’s cache version could be a good option. First, determine whether or not age is important.
135. Briefly explain about Python namespaces?
A namespace in Python is a mapping from names to objects, which ensures that each name refers to exactly one object within that namespace. Namespaces are maintained like a dictionary, where the key is the name and the value is the object it refers to.
Different types are as follows:
- Built-in namespace – contains all the built-in names in Python.
- Global namespace – contains the names defined at the top level of a module, such as your main program.
- Enclosing namespace – the namespace of an outer function, as seen from a function nested inside it.
- Local namespace – contains the names defined inside the function that is currently running.
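The four kinds of namespace can be seen in a short sketch. The names label, outer, and inner are made up for illustration; the same name is bound separately in the global, enclosing, and local namespaces.

```python
import builtins

label = "global"                # bound in the module's global namespace


def outer():
    label = "enclosing"         # local to outer(), enclosing for inner()

    def inner():
        label = "local"         # bound in inner()'s local namespace
        # locals() returns the local namespace as a dictionary of names
        return label, sorted(locals())

    return inner(), label


(inner_label, inner_names), outer_label = outer()
print(inner_label, outer_label, label)   # local enclosing global
print(inner_names)                       # ['label']
print(hasattr(builtins, "len"))          # True: len lives in the built-in namespace
```

Python looks a name up in the local namespace first, then in enclosing functions, then in the global namespace, and finally in the built-ins.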
136. Briefly explain about Break, Pass and Continue statements in Python ?
Break: A break statement immediately terminates the loop, and control passes to the statement after the body of the loop.
Continue: A continue statement ends the current iteration, skips the rest of the loop body for that iteration, and moves control to the next iteration of the loop.
Pass: A pass statement does nothing. It is a placeholder used where a statement is syntactically required but no action is needed.
Example:
GL = [10, 30, 20, 100, 212, 33, 13, 0, 50, 60, 70]
for g in GL:
    pass
    if (g == 0):
        current = g
        break
    elif(g%2==0):
        continue
    print(g) # output => 33 13
print(current) # output => 0
137. Give me an example on how you can convert a list to a string?
The example below shows how to convert a list to a string using the string method join().
fruits = ['apple', 'orange', 'mango', 'papaya', 'guava']
listAsString = ' '.join(fruits)
print(listAsString)
apple orange mango papaya guava
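Note that join() only accepts strings. A minimal sketch for a list of numbers (the variable names are illustrative) converts each item first:

```python
numbers = [3, 1, 4, 1, 5]

# str.join() raises TypeError for non-string items, so convert each one first.
as_csv = ", ".join(str(n) for n in numbers)
print(as_csv)   # 3, 1, 4, 1, 5
```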
138. Give me an example where you can convert a list to a tuple?
The example below shows how to convert a list to a tuple using the tuple() function. Remember that tuples are immutable, so the resulting tuple cannot be modified in place; to change its contents, convert it back to a list with list().
fruits = ['apple', 'orange', 'mango', 'papaya', 'guava']
listAsTuple = tuple(fruits)
print(listAsTuple)
('apple', 'orange', 'mango', 'papaya', 'guava')
139. How do you count the occurrences of a particular element in the list ?
To count the occurrences of an element in a Python list, use the list's count() method.
fruits = ['apple', 'orange', 'mango', 'papaya', 'guava']
print(fruits.count('apple'))
Output: 1
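When the counts of all elements are needed at once, collections.Counter from the standard library is a common alternative. A small sketch with made-up data:

```python
from collections import Counter

fruits = ["apple", "orange", "apple", "guava", "apple", "orange"]
counts = Counter(fruits)

print(counts["apple"])         # 3
print(counts["papaya"])        # 0, missing elements count as zero
print(counts.most_common(1))   # [('apple', 3)]
```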
140. How do you debug a python program?
There are several ways to debug a Python program:
- Using the print() function to print out variables and intermediate results to the console
- Using a debugger like pdb or ipdb
- Adding assert statements to the code to check for certain conditions
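As a rough illustration of the first and third techniques, the sketch below uses a hypothetical average() function. An interactive debugger session is not shown; calling the built-in breakpoint() at the point of interest would pause there and open pdb.

```python
def average(values):
    # assert documents an assumption and fails loudly when it is violated
    assert len(values) > 0, "average() needs at least one value"
    total = sum(values)
    # a temporary print shows intermediate state while investigating
    print(f"debug: total={total}, count={len(values)}")
    return total / len(values)


print(average([2, 4, 6]))   # 4.0

try:
    average([])
except AssertionError as err:
    print("assertion failed:", err)
```

Assertions are skipped when Python runs with the -O flag, so they suit debugging checks rather than validation of user input.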
141. What is the difference between a list and a tuple in Python?
A list is a mutable data type, meaning it can be modified after it is created. A tuple is immutable, meaning it cannot be modified after it is created. Tuples are typically slightly cheaper to create and store than lists, and they are safer to share, as other parts of the code cannot modify them accidentally.
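The difference can be seen directly in a short sketch (the variable names are chosen for illustration):

```python
point_list = [1, 2]
point_tuple = (1, 2)

point_list[0] = 10                # lists can be changed in place
try:
    point_tuple[0] = 10           # tuples cannot, so this raises TypeError
except TypeError as err:
    error_message = str(err)

# A tuple of hashable items is itself hashable, so it can be a dictionary key.
distances = {point_tuple: 2.24}
print(point_list, error_message, distances[(1, 2)])
```

A list cannot be used as a dictionary key, which is one practical reason to choose a tuple.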
142. How do you handle exceptions in Python?
Exceptions in Python can be handled using a try-except block. For example:
try:
    # code that may raise an exception
    pass
except SomeExceptionType:
    # code to handle the exception
    pass
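A more concrete sketch, using a hypothetical parse_ratio() helper, shows several except clauses together with the optional else and finally clauses:

```python
def parse_ratio(numerator, denominator):
    try:
        result = int(numerator) / int(denominator)
    except ValueError:
        # int() could not parse one of the inputs
        return None
    except ZeroDivisionError:
        return float("inf")
    else:
        # runs only when the try block raised no exception
        return result
    finally:
        # runs in every case, even when a return statement is executed
        print("parse_ratio finished")


print(parse_ratio("6", "3"))   # 2.0
print(parse_ratio("x", "3"))   # None
```

Catching specific exception types, rather than a bare except, keeps unexpected errors visible.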
143. How do you reverse a string in Python?
There are several ways to reverse a string in Python:
- Using a slice with a step of -1:
string = "abcdefg"
reversed_string = string[::-1]
- Using the reversed function:
string = "abcdefg"
reversed_string = "".join(reversed(string))
- Using a loop:
string = "abcdefg"
reversed_string = ""
for char in string:
    reversed_string = char + reversed_string
144. How do you sort a list in Python?
There are several ways to sort a list in Python:
- Using the sort method of the list, which sorts in place:
my_list = [3, 4, 1, 2]
my_list.sort()
- Using the sorted function, which returns a new list:
my_list = [3, 4, 1, 2]
sorted_list = sorted(my_list)
- Using the sorted function with itemgetter from the operator module as the key:
from operator import itemgetter
my_list = [{"a": 3}, {"a": 1}, {"a": 2}]
sorted_list = sorted(my_list, key=itemgetter("a"))
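Both sort() and sorted() accept key and reverse arguments. The sketch below uses made-up data and also shows a common pitfall: list.sort() returns None.

```python
words = ["banana", "Fig", "apple", "date"]

by_length = sorted(words, key=len)                # shortest first
case_insensitive = sorted(words, key=str.lower)   # ignore upper/lower case
descending = sorted(words, reverse=True)          # reverse of the default order

result = words.sort()    # sorts the list in place and returns None

print(by_length)         # ['Fig', 'date', 'apple', 'banana']
print(case_insensitive)  # ['apple', 'banana', 'date', 'Fig']
print(descending)        # ['date', 'banana', 'apple', 'Fig']
print(result, words)     # None ['Fig', 'apple', 'banana', 'date']
```

By default, uppercase letters sort before lowercase ones, which is why "Fig" comes first in the in-place sort.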
145. How do you create a dictionary in Python?
There are several ways to create a dictionary in Python:
- Using curly braces and colons to separate keys and values:
my_dict = {"key1": "value1", "key2": "value2"}
- Using the dict constructor with keyword arguments:
my_dict = dict(key1="value1", key2="value2")
- Using the dict constructor with an existing mapping:
my_dict = dict({"key1": "value1", "key2": "value2"})
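A few other common ways to build a dictionary are sketched below; the variable names and data are illustrative only.

```python
keys = ["a", "b", "c"]
values = [1, 2, 3]

from_pairs = dict(zip(keys, values))          # pair up two sequences
squares = {n: n * n for n in range(1, 4)}     # dictionary comprehension
defaults = dict.fromkeys(keys, 0)             # same starting value for every key

print(from_pairs)   # {'a': 1, 'b': 2, 'c': 3}
print(squares)      # {1: 1, 2: 4, 3: 9}
print(defaults)     # {'a': 0, 'b': 0, 'c': 0}
```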
Ques 1. How do you stand out in a Python coding interview?
Now that you're ready for a Python interview in terms of technical skills, you must be wondering how to stand out from the crowd so that you're the selected candidate. You must be able to show that you can write clean, production-quality code and that you know the required libraries and tools. If you've worked on any prior projects, showcasing them in your interview will also help you stand out.
Ques 2. How do I prepare for a Python interview?
To prepare for a Python interview, you must know syntax, keywords, functions and classes, data types, basic coding, and exception handling. A basic knowledge of the commonly used libraries and IDEs, along with reading Python tutorials and blogs, will help you. Prepare example projects to showcase, brush up on basic algorithms, and consider taking a free course on Python data structures. This will help you stay prepared.
Ques 3. Are Python coding interviews very difficult?
The difficulty level of a Python Interview will vary depending on the role you are applying for, the company, their requirements, and your skill and knowledge/work experience. If you’re a beginner in the field and are not yet confident about your coding ability, you may feel that the interview is difficult. Being prepared and knowing what type of python interview questions to expect will help you prepare well and ace the interview.
Ques 4. How do I pass the Python coding interview?
To clear a coding interview, it is important to have adequate knowledge of Object Relational Mapper (ORM) libraries, Django or Flask, unit testing and debugging, the fundamental design principles behind a scalable application, and Python packages such as NumPy and scikit-learn. Showcasing your previous work experience or coding ability through projects is an added advantage.
Ques 5. How do you debug a python program?
Run the script under pdb, Python's built-in debugger, from the terminal:
$ python -m pdb python-script.py
Ques 6. Which courses or certifications can help boost knowledge in Python?
With this, we have reached the end of the blog on top Python Interview Questions. If you wish to upskill, taking up a certificate course will help you gain the required knowledge. You can take up a python programming course and kick-start your career in Python.