hello_python_world #4: filtering dataframe
A subset is an extract of a data table or data frame based on a condition. We use filters to create a subset of the data, and the condition is a logical expression. One example of a subset is the sales from a specific store: data['store_column_name'] == 'store01'. This logic reads as: show me the rows where store_column_name is equal to 'store01'. We can also create a subset of the stores with sales greater than 1 million: data['store_sales_column'] > 1000000
Concepts and inputs
We need to define a logical expression to filter a data frame. To build the logical expression we use comparison operators: greater than '>', less than '<', equal to '==', not equal to '!='. A comparison operator compares two values: a > b means a is greater than b, and so on. Be aware that these comparisons only behave as expected between values of the same type. If you compare '6' == 6, the result will be False, because we are comparing a string with a number.
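The operators above can be tried on plain Python values before using them on a data frame. This is a small sketch; the variable names a, b, mismatch and converted are made up for illustration.

```python
# Two example values to compare (hypothetical).
a, b = 5, 3

print(a > b)   # greater than
print(a < b)   # less than
print(a == b)  # equal to
print(a != b)  # not equal to

# A string and a number are different types, so they are not equal
# even when they look the same on screen.
mismatch = '6' == 6

# Converting the string to a number first makes the comparison meaningful.
converted = int('6') == 6

print(mismatch, converted)
```

Data read from a file often arrives as text, so it is worth checking the column type before comparing it with a number.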
Solution
The variable mask holds the condition. We want to subset the sales from store B1, so mask is the logical expression that compares the store column with 'B1'.
Code
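A minimal sketch of the filter described in the solution, using pandas. The sample data and the column names store, item and sales are assumptions made up for illustration; replace them with the names in your own data frame.

```python
import pandas as pd

# Hypothetical sample data (column names and values are assumptions).
data = pd.DataFrame({
    'store': ['B1', 'B2', 'B1', 'B3'],
    'item': ['apple', 'apple', 'pear', 'pear'],
    'sales': [100, 250, 75, 300],
})

# mask is a boolean Series: True on the rows where store equals 'B1'.
mask = data['store'] == 'B1'

# Indexing the data frame with the mask keeps only the True rows.
store_b1 = data[mask]

print(store_b1)
```

Keeping the condition in its own variable makes it easy to inspect with print(mask) and to reuse in later filters.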
Comments
A filter can combine several conditions. If we want to filter the data based on item and store, we create two separate conditions and then connect them using the operator '&' (and).
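Combining two conditions could look like the sketch below. As before, the sample data and the column names store, item and sales are hypothetical.

```python
import pandas as pd

# Hypothetical sample data (column names and values are assumptions).
data = pd.DataFrame({
    'store': ['B1', 'B2', 'B1', 'B3'],
    'item': ['apple', 'apple', 'pear', 'pear'],
    'sales': [100, 250, 75, 300],
})

# One condition per column.
store_mask = data['store'] == 'B1'
item_mask = data['item'] == 'apple'

# '&' keeps only the rows where both conditions are True.
both = data[store_mask & item_mask]

# Written inline, each condition needs its own parentheses,
# because '&' is evaluated before '=='.
same = data[(data['store'] == 'B1') & (data['item'] == 'apple')]

print(both)
```

Note that pandas expects '&' here rather than the Python keyword and, which does not work element by element on a Series.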