{"id":19,"date":"2023-09-13T09:19:39","date_gmt":"2023-09-13T09:19:39","guid":{"rendered":"http:\/\/harvard-open-data-project.local\/?page_id=19"},"modified":"2023-09-13T09:19:41","modified_gmt":"2023-09-13T09:19:41","slug":"python-data-wrangling","status":"publish","type":"page","link":"http:\/\/harvard-open-data-project.local\/python-data-wrangling\/","title":{"rendered":"Python Data Wrangling"},"content":{"rendered":"\n
This article is an introduction to Python for beginners, with the aim of equipping you with the basic knowledge and tools you need to explore and solve more complex problems.<\/p>\n\n\n\n
In Python, a variable is a placeholder for storing a value. The value of a variable can be changed, and it can be of different types, such as integers, floats, strings, or booleans. To print the value of a variable, you use the There are four commonly used types of variables in Python:<\/p>\n\n\n\n In Python, strings are sequences of characters. They are defined by enclosing the characters in single (' ') or double (\" \") quotes. Like lists, strings are sequences, so many list operations, such as indexing and slicing, also work on them; unlike lists, strings are immutable and cannot be changed in place. If you want to include quotation marks within a string, you can do so by using a backslash ( Moreover, you can perform indexing on strings, which means you can access any character in a string by referring to its position inside the string. Remember that Python indexing starts at 0.<\/p>\n\n\n Python also provides various methods to manipulate strings. For example, you can convert a string to lowercase or uppercase using the You can also split a string into a list of substrings using the Lists in Python are used to store multiple items in a single variable. Lists are ordered, mutable, and allow duplicate values. You can access elements in a list by referring to their index number:<\/p>\n\n\n You can add elements to a list using the Python also provides several methods to remove elements from a list. You can use the Python uses control flow tools such as 
For instance, if you want to declare a variable y<\/code> with a value of
3.45<\/code>, you would use the following Python code:<\/p>\n\n\n
y = 3.45\n<\/code><\/span><\/pre>\n\n\n
print<\/code> function:<\/p>\n\n\n
print(y)  # this will print 3.45\n<\/code><\/pre>\n\n\n
\n
Integers (int)<\/code>: These are whole numbers such as 1, 2, 3, and so on.<\/li>\n\n\n\n
Strings (str)<\/code>: These are sequences of characters enclosed in single or double quotes ('hello', \"world\").<\/li>\n\n\n\n
Booleans (bool)<\/code>: These take one of two possible values, True or False.<\/li>\n\n\n\n
Floats (float)<\/code>: These are real numbers with a decimal point such as 3.14159.
You can convert from one type to another using the built-in Python functions int()<\/code>,
str()<\/code>,
bool()<\/code>, and
float()<\/code>.
Python also supports arithmetic operations on variables, including addition, subtraction, multiplication, division, modulus, and exponentiation. For example:<\/li>\n<\/ol>\n\n\nx = 7\ny = 2\nprint(x + y)   # Addition; this will print 9\nprint(x - y)   # Subtraction; this will print 5\nprint(x * y)   # Multiplication; this will print 14\nprint(x \/ y)   # Division; this will print 3.5\nprint(x % y)   # Modulus (remainder); this will print 1\nprint(x ** y)  # Exponentiation; this will print 49\n<\/code><\/pre>\n\n\n
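The conversion functions mentioned above can be sketched as follows. This is a minimal illustration; the variable names (count, price, label, and so on) are made up for the example and do not come from the article.

```python
# Convert between the four basic types with the built-in functions.
count = int("42")        # string -> integer
price = float("3.45")    # string -> float
label = str(19)          # integer -> string
truncated = int(3.99)    # float -> integer; the fractional part is dropped, not rounded
is_empty = bool("")      # an empty string converts to False
is_set = bool("False")   # any non-empty string converts to True, even the text "False"

print(count + 1)          # 43
print(label + " pages")   # 19 pages
print(truncated)          # 3
print(is_empty, is_set)   # False True
```

Note that int() raises a ValueError when the string does not look like a whole number, for example int("3.45"), so it can be worth converting with float() first when the input may contain a decimal point.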
Working with Strings<\/h2>\n\n\n\n
You can concatenate, or join, two strings using the +<\/code> operator:<\/p>\n\n\n
print(\"Hello\" + \" \" + \"World\")  # this will print Hello World\n<\/code><\/pre>\n\n\n
\\<\/code>) before the quotation mark:<\/p>\n\n\n
print(\"He said, \\\"Hello.\\\"\")  # this will print He said, \"Hello.\"\n<\/code><\/pre>\n\n\n
my_string = \"Hello World\"\nprint(my_string[0])   # first character; this will print H\nprint(my_string[-1])  # negative indexes count from the end; this will print d\n<\/code><\/pre>\n\n\n
lower()<\/code> and
upper()<\/code> methods, respectively:<\/p>\n\n\n
my_string = \"Hello World\"\nprint(my_string.lower())  # this will print hello world\nprint(my_string.upper())  # this will print HELLO WORLD\n<\/code><\/pre>\n\n\n
split()<\/code> method:<\/p>\n\n\n
my_string = \"Hello World\"\nprint(my_string.split(\" \"))  # this will print ['Hello', 'World']\n<\/code><\/pre>\n\n\n
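Because strings are sequences, several list-style operations work on them too. The sketch below reuses my_string from the examples above; new_string is a name introduced only for this illustration. It also shows one way strings differ from lists: they cannot be changed in place.

```python
my_string = "Hello World"

print(len(my_string))        # number of characters; prints 11
print(my_string[0:5])        # slice from index 0 up to (not including) 5; prints Hello
print(my_string[6:])         # slice from index 6 to the end; prints World
print("World" in my_string)  # membership test; prints True

# Strings are immutable, so assigning to an index raises a TypeError.
try:
    my_string[0] = "J"
except TypeError as error:
    print("Cannot modify a string in place:", error)

# Build a new string instead of editing the old one.
new_string = "J" + my_string[1:]
print(new_string)            # Jello World
```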
Understanding Lists and List Methods<\/h2>\n\n\n\n
To declare a list, you use square brackets []<\/code>:<\/p>\n\n\n
my_list = [\"apple\", \"banana\", \"cherry\"]\n<\/code><\/pre>\n\n\n
print(my_list[0])  # this will print apple\n<\/code><\/pre>\n\n\n
append()<\/code> method, or insert an element at a specific position using the
insert()<\/code> method:<\/p>\n\n\n
my_list.append(\"dragonfruit\")\nprint(my_list)  # this will print ['apple', 'banana', 'cherry', 'dragonfruit']\nmy_list.insert(1, \"mango\")\nprint(my_list)  # this will print ['apple', 'mango', 'banana', 'cherry', 'dragonfruit']\n<\/code><\/pre>\n\n\n
remove()<\/code> method to remove a specific item, or the
pop()<\/code> method to remove an item at a specified index.<\/p>\n\n\n
my_list.remove(\"banana\")\nprint(my_list)  # this will print ['apple', 'mango', 'cherry', 'dragonfruit']\nmy_list.pop(1)   # removes the item at index 1 ('mango')\nprint(my_list)  # this will print ['apple', 'cherry', 'dragonfruit']\n<\/code><\/pre>\n\n\n
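A few details of these list methods are easy to miss, so here is a small sketch of them. The list name fruits and its contents are assumptions made for the example, chosen to show duplicates and in-place changes.

```python
fruits = ["apple", "banana", "cherry", "banana"]  # duplicate values are allowed

fruits[0] = "apricot"      # lists are mutable: replace an item by index
fruits.remove("banana")    # removes only the first matching item
print(fruits)              # ['apricot', 'cherry', 'banana']

last = fruits.pop()        # with no index, pop() removes and returns the last item
print(last)                # banana
print(fruits)              # ['apricot', 'cherry']

# remove() raises a ValueError when the item is absent, so check membership first.
if "mango" in fruits:
    fruits.remove("mango")
print(len(fruits))         # 2
```

Checking with `in` before calling remove() is one simple way to avoid the error; whether that or a try/except block reads better depends on the surrounding code.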
Control Flow and Functions<\/h2>\n\n\n\n
if<\/code>,
for<\/code>, and
while<\/code> statements for handling conditions and looping through code.
An if<\/code> statement is used to test a condition. If the condition is true, Python executes the block of code inside the
if<\/code> statement:<\/p>\n\n\n
x = 10\nif x > 5:\n    print(\"x is greater than 5\")  # this will be printed because x is greater than 5\n<\/code><\/pre>\n\n\n
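When there is more than one case to handle, an if statement can be extended with elif and else branches. The sketch below wraps the comparison in a small function; describe is a hypothetical helper written for this illustration, not something defined elsewhere in the article.

```python
def describe(x):
    """Return a short message about how x compares with 5 (hypothetical helper)."""
    if x > 5:
        return "x is greater than 5"
    elif x == 5:
        return "x is equal to 5"
    else:
        return "x is less than 5"


print(describe(10))  # x is greater than 5
print(describe(5))   # x is equal to 5
print(describe(2))   # x is less than 5
```

Python checks the conditions from top to bottom and runs only the first branch whose condition is true; the else branch runs when none of them are.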
For<\/code> and