Pentaho Data Integration Cookbook Second Edition is written in a cookbook format, presenting examples in the style of recipes. This allows you to go directly to your topic of interest, or to follow topics throughout a chapter to gain thorough, in-depth knowledge. The book is designed for developers who are familiar with the basics of Kettle but who wish to move up to the next level. It is also aimed at advanced users who want to learn how to use the new features of PDI as well as best practices for working with Kettle.

Content: Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface;
Chapter 1: Working with Databases; Introduction; Connecting to a database; Getting data from a database; Getting data from a database by providing parameters; Getting data from a database by running a query built at runtime; Inserting or updating rows in a table; Inserting new rows where a simple primary key has to be generated; Inserting new rows where the primary key has to be generated based on stored values; Deleting data from a table; Creating or altering a database table from PDI (design time); Creating or altering a database table from PDI (runtime); Inserting, deleting, or updating a table depending on a field; Changing the database connection at runtime; Loading a parent-child table; Building SQL queries via database metadata; Performing repetitive database design tasks from PDI;
Chapter 2: Reading and Writing Files; Introduction; Reading a simple file; Reading several files at the same time; Reading semi-structured files; Reading files having one field per row; Reading files with some fields occupying two or more rows; Writing a simple file; Writing a semi-structured file; Providing the name of a file (for reading or writing) dynamically; Using the name of a file (or part of it) as a field; Reading an Excel file; Getting the value of specific cells in an Excel file; Writing an Excel file with several sheets; Writing an Excel file with a dynamic number of sheets; Reading data from an AWS S3 Instance;
Chapter 3: Working with Big Data and Cloud Sources; Introduction; Loading data into Salesforce.com; Getting data from Salesforce.com; Loading data into Hadoop; Getting data from Hadoop; Loading data into HBase; Getting data from HBase; Loading data into MongoDB; Getting data from MongoDB;
Chapter 4: Manipulating XML Structures; Introduction; Reading simple XML files; Specifying fields by using Path notation; Validating well-formed XML files; Validating an XML file against DTD definitions; Validating an XML file against an XSD schema; Generating a simple XML document; Generating complex XML structures; Generating an HTML page using XML and XSL transformations; Reading an RSS Feed; Generating an RSS Feed;
Chapter 5: File Management; Introduction; Copying or moving one or more files; Deleting one or more files; Getting files from a remote server; Putting files on a remote server; Copying or moving a custom list of files; Deleting a custom list of files; Comparing files and folders; Working with ZIP files; Encrypting and decrypting files;
Chapter 6: Looking for Data; Introduction; Looking for values in a database table; Looking for values in a database with complex conditions; Looking for values in a database with dynamic queries; Looking for values in a variety of sources; Looking for values by proximity; Looking for values by using a web service

Abstract: The premier open source ETL tool is at your command with this recipe-packed cookbook. Learn to use data sources in Kettle, avoid pitfalls, and dig out the advanced features of Pentaho Data Integration the easy way.

In Detail: Pentaho Data Integration is the premier open source ETL tool, providing easy, fast, and effective ways to move and transform data. While PDI is relatively easy to pick up, it can take time to learn the best practices so you can design your transformations to process data faster and more efficiently. If you are looking for clear and practical recipes that will advance your skills in Kettle, then this is the book for you. Pentaho Data Integration Cookbook Second Edition explains the Kettle features in detail and provides easy-to-follow recipes on file management and databases that can throw a curve ball to even the most experienced developers. It provides updates to the material covered in the first edition as well as new recipes that show you how to use some of the key features of PDI that have been released since the publication of the first edition. You will learn how to work with various data sources, from relational and NoSQL databases to flat files, XML files, and more. The book also covers best practices that you can take advantage of immediately within your own solutions, such as reusable code, data quality, and plugins that can add even more functionality. It provides recipes that cover the common pitfalls that even seasoned developers can find themselves facing. You will also learn how to use various data sources in Kettle as well as advanced features.

Integrate Kettle with other components of the Pentaho Business Intelligence Suite to build and publish Mondrian schemas, create reports, and populate dashboards. This book contains an organized sequence of recipes packed with screenshots, tables, and tips so you can complete the tasks as efficiently as possible. Manipulate your data by exploring, transforming, validating, integrating, and performing data analysis.

Alex Meadows, Adrián Sergio Pulvirenti, Maria Carina Roldán. Over 100 recipes for building open source ETL solutions with Pentaho Data Integration--cover. Includes bibliographical references and index.