Business users familiar with Base SAS programming can now learn Python by example. You will learn via examples that map SAS programming constructs and coding patterns into their Python equivalents. The primary focus is on pandas and on the data management issues that arise in data analysis.

It is estimated that there are three million or more SAS users worldwide today. As the data science landscape shifts from SAS to open source software such as Python, many users will feel the need to update their skills. Most users are not formally trained in computer science and have likely acquired their SAS programming skills as part of their job. As a result, the current documentation and the plethora of books and websites for learning Python, which are largely technical, are not well suited to most SAS users.

**Python for SAS Users** provides the most comprehensive set of examples currently available. It contains over 200 Python scripts and approximately 75 SAS programs that are analogs to the Python scripts. The first chapters are more Python-centric, while the remaining chapters pair SAS examples with corresponding Python examples to solve common data analysis tasks such as reading multiple input sources, missing value detection, imputation, merging/combining data, and producing output. This book is an indispensable guide for integrating SAS and Python workflows.

**What You’ll Learn**

* Quickly master Python for data analysis without using a trial-and-error approach
* Understand the similarities and differences between Base SAS and Python
* Better determine which language to use, depending on your needs
* Obtain quick results

**Who This Book Is For**

SAS users, SAS programmers, data scientists, data scientist leaders, and Python users who need to work with SAS

**Table of Contents**

* About the Authors
* About the Technical Reviewers
* Acknowledgments
* Introduction
* Chapter 1: Why Python?
  * Setting Up a Python Environment
  * Anaconda3 Install Process for Windows
  * Troubleshooting Python Installation for Windows
  * Anaconda3 Install Process for Linux
  * Executing a Python Script on Windows
  * Case Sensitivity
  * Line Continuation Symbol
  * Executing a Python Script on Linux
  * Integrated Development Environment (IDE) for Python
  * Jupyter Notebook
  * Jupyter Notebook for Linux
  * Summary
* Chapter 2: Python Types and Formatting
  * Numerics
  * Python Operators
  * Boolean
  * Comparison Operators
  * IN/NOT IN
  * AND/OR/NOT
  * Numerical Precision
  * Strings
  * String Slicing
  * Formatting
  * Formatting Strings
  * Formatting Integers
  * Formatting Floats
  * Datetime Formatting
  * Summary
* Chapter 3: pandas Library
  * Column Types
  * Series
  * DataFrames
  * DataFrame Validation
  * DataFrame Inspection
  * Missing Data
  * Missing Value Detection
  * isnull() Method
  * Dropping Missing Values
  * Imputation
  * Summary
* Chapter 4: Indexing and GroupBy
  * Create Index
  * Return Columns by Position
  * Return Rows by Position
  * Return Rows and Columns by Label
  * Conditionals
  * Updating
  * Return Rows and Columns by Position
  * MultiIndexing
  * Basic Subsets with MultiIndexes
  * Advanced Indexing with MultiIndexes
  * Slicing Rows and Columns
  * Conditional Slicing
  * Cross Sections
  * GroupBy
  * Iteration Over Groups
  * GroupBy Summary Statistics
  * Filtering by Group
  * Group by Column with Continuous Values
  * Transform Based on Group Statistic
  * Pivot
  * Summary
* Chapter 5: Data Management
  * SAS Sort/Merge
  * Inner Join
  * Right Join
  * Left Join
  * Outer Join
  * Right Join Unmatched Keys
  * Left Join Unmatched Keys
  * Outer Join Unmatched Keys
  * Validate Keys
  * Joining on an Index
  * Join Key Column with an Index
  * Update
  * Conditional Update
  * Concatenation
  * Finding Column Min and Max Values
  * Sorting
  * Finding Duplicates
  * Dropping Duplicates
  * Sampling
  * Convert Types
  * Rename Columns
  * Map Column Values
  * Transpose
  * Summary
* Chapter 6: pandas Readers and Writers
  * Reading .csv Files
  * Date Handling in .csv Files
  * Read .xls Files
  * Write .csv Files
  * Write .xls Files
  * Read JSON
  * Write JSON
  * Read RDBMS Tables
  * Query RDBMS Tables
  * Read SAS Datasets
  * Write RDBMS Tables
  * Summary
* Chapter 7: Date and Time
  * Date Object
  * Return Today’s Date
  * Date Manipulation
  * Shifting Dates
  * Date Formatting
  * Dates to Strings
  * Strings to Dates
  * Time Object
  * Time of Day
  * Time Formatting
  * Times to Strings
  * Strings to Time
  * Datetime Object
  * Combining Times and Dates
  * Returning Datetime Components
  * Strings to Datetimes
  * Datetimes to Strings
  * Timedelta Object
  * Time zone Object
  * Naïve and Aware Datetimes
  * pytz Library
  * SAS Time zone
  * Summary
* Chapter 8: SASPy Module
  * Install SASPy
  * Set Up the sascfg_personal.py Configuration File
  * Make SAS-Supplied .jar Files Available
  * SASPy Examples
  * Basic Data Wrangling
  * Write DataFrame to SAS Dataset
  * Define the Libref to Python
  * Write the DataFrame to a SAS Dataset
  * Execute SAS Code
  * Write SAS Dataset to DataFrame
  * Passing SAS Macro Variables to Python Objects
  * Prompting
  * Scripting SASPy
  * Datetime Handling
  * Summary
* Appendix A: Generating the Tickets DataFrame
* Appendix B: Many-to-Many Use Case
* Index

**Chapters**

* Front Matter, pages i-xvii
* Why Python? (Randy Betancourt, Sarah Chen), pages 1-25
* Python Types and Formatting (Randy Betancourt, Sarah Chen), pages 27-63
* pandas Library (Randy Betancourt, Sarah Chen), pages 65-109
* Indexing and GroupBy (Randy Betancourt, Sarah Chen), pages 111-176
* Data Management (Randy Betancourt, Sarah Chen), pages 177-241
* pandas Readers and Writers (Randy Betancourt, Sarah Chen), pages 243-294
* Date and Time (Randy Betancourt, Sarah Chen), pages 295-372
* SASPy Module (Randy Betancourt, Sarah Chen), pages 373-409
* Back Matter, pages 411-434