Sunday, December 1, 2013

6 Steps to Connecting your Database with Salesforce.com using Informatica Cloud

Step - 1 Define the task and select the task operation.


Step - 2 Configure the Database Source and select the table from which data needs to be read.


Step - 3 Configure the Salesforce Target and select the appropriate object into which the data needs to be loaded.


Step - 4 Define data filters that will be applied to the Source.


Step - 5 Configure Mappings and Expressions. Drag a field from the source and drop it onto the target.


Step - 6 Schedule the task.
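
Outside the wizard, the same flow is easy to picture in code. Below is a minimal, purely illustrative Python sketch of what such a task boils down to (read rows from a database table, apply a source filter, map fields and load them into a Salesforce object). It assumes the simple-salesforce library, a hypothetical customers table and an Account target; the real task is of course configured entirely in the Informatica Cloud UI.

# Minimal sketch of the wizard's flow: source -> filter -> mapping -> target.
# The "customers" table, the Account mapping and all credentials are
# illustrative placeholders.
import sqlite3
from simple_salesforce import Salesforce

# Step 2: configure the database source and read the table.
db = sqlite3.connect("source.db")
rows = db.execute(
    # Step 4: data filter applied on the source.
    "SELECT name, phone, city FROM customers WHERE is_active = 1"
).fetchall()

# Step 3: configure the Salesforce target.
sf = Salesforce(username="user@example.com",
                password="password",
                security_token="token")

# Step 5: map source columns to target fields and load the records.
for name, phone, city in rows:
    sf.Account.create({
        "Name": name,
        "Phone": phone,
        "BillingCity": city,
    })

# Step 6: scheduling would be handled by the Informatica Cloud scheduler
# (or cron, in this stand-alone sketch) rather than by the code itself.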



References:
https://community.informatica.com/servlet/JiveServlet/previewBody/2233-102-1-2479/6steps.pdf


Wednesday, April 10, 2013

How to use Salesforce to Salesforce Automation with Informatica Cloud

Salesforce to Salesforce is a native Force.com feature for sharing data records in real time between two Force.com environments (orgs). For example, two business partners may want to collaborate by sharing account and opportunity data across their orgs. Sharing data with the Salesforce to Salesforce feature is very easy.
In most scenarios, data is shared manually through the standard Salesforce.com user interface. A user creates an account in one org, clicks the external sharing button, selects the appropriate org connection and then shares the record. This involves manual effort, and when sales reps are working on a lot of records, the manual sharing process can become a painful exercise that may cause user adoption issues in the long run.
Two objects control the Salesforce to Salesforce feature at the back end:
  • PartnerNetworkConnection – Represents a Salesforce to Salesforce connection between two Salesforce orgs.
  • PartnerNetworkRecordConnection – Represents a record shared between two Salesforce orgs using Salesforce to Salesforce.

Whenever a user shares a record using Salesforce to Salesforce, a record gets created in the PartnerNetworkRecordConnection object.
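
As a quick way to get familiar with these two objects, the following sketch (Python with the simple-salesforce library; credentials are placeholders and the snippet is purely illustrative) queries the active org-to-org connections and the records already shared over them.

# Purely illustrative: inspect the Salesforce to Salesforce objects.
from simple_salesforce import Salesforce

sf = Salesforce(username="user@example.com",
                password="password",
                security_token="token")

# Active org-to-org connections.
connections = sf.query(
    "SELECT Id, ConnectionName, ConnectionStatus "
    "FROM PartnerNetworkConnection WHERE ConnectionStatus = 'Accepted'"
)

# Records already shared over those connections.
shared = sf.query(
    "SELECT Id, ConnectionId, LocalRecordId, Status "
    "FROM PartnerNetworkRecordConnection"
)

for rec in shared["records"]:
    print(rec["LocalRecordId"], rec["Status"])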
Informatica Cloud can be used to make this integration between two Salesforce orgs seamless and automatic. The process flow is as follows:
 
  • The user creates the record that needs to be shared in their org.
  • As soon as the record is created, a workflow is triggered that sends an outbound message.
  • This outbound message initiates an Informatica Cloud task.
  • The Informatica Cloud task inserts the record details (created in step 1) into the PartnerNetworkRecordConnection object (a minimal sketch of this insert follows below).
  • As soon as this record is created in PartnerNetworkRecordConnection, the original record is shared with the partner org.
Thus, records created in one org are automatically shared with the partner org in near real time, without any manual intervention.
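
At its core, the Informatica Cloud task in step 4 performs a single insert. A hedged sketch of that insert, again in Python with simple-salesforce and placeholder credentials and record Ids, might look roughly like this:

# Rough equivalent of what the Informatica Cloud task does: share a newly
# created record by inserting a row into PartnerNetworkRecordConnection.
# Credentials and the record Id are placeholders.
from simple_salesforce import Salesforce

sf = Salesforce(username="user@example.com",
                password="password",
                security_token="token")

# Look up the accepted connection to the partner org.
conn_id = sf.query(
    "SELECT Id FROM PartnerNetworkConnection "
    "WHERE ConnectionStatus = 'Accepted' LIMIT 1"
)["records"][0]["Id"]

# Share the record reported by the outbound message.
sf.PartnerNetworkRecordConnection.create({
    "ConnectionId": conn_id,
    "LocalRecordId": "001XXXXXXXXXXXXXXX",  # Id received in the outbound message
    "SendClosedTasks": False,
    "SendOpenTasks": False,
    "SendEmails": False,
})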

Overall, the benefits of using Informatica Cloud are:
 
  • Data synchronization is achieved through out-of-the-box functionality.
  • Integration logic is centralized; there is no need to write custom Visualforce/Apex code to automate the Salesforce to Salesforce feature.
  • Record selection criteria for sharing can be easily defined using filters.
  • Exception handling for failed records can be set up to monitor the record sharing status.
  • The approach is scalable.

So if you are planning to implement the Salesforce to Salesforce feature to share records automatically, consider the Informatica Cloud approach described above.
Feel free to comment or reach out with any clarifications or questions.

Thursday, February 7, 2013

Integration and Analytics in the Cloud (100% Cloud)

I work as an Integration Architect.
 
A few years ago, life was not easy for senior management. One of my senior managers had to run a BI report every morning at 7 AM for analysis and decision making. He would reach the office before 7 AM, start his workstation, open the appropriate tools and then run the report. The whole process took a good 30 minutes to an hour. On top of that, he had concerns about the elasticity, availability, flexibility and cost of the tools and technologies involved.
 
He used to envision that one day he would be able to do all of this at the click of a button, from anywhere, anytime. “No constraints whatsoever.”

We did an integration implementation for a large Canadian telecom giant. The project was called Marketing Data Mart.
 
The Marketing Data Mart consisted of an integrated architecture of heterogeneous data stores and technologies to support the ultimate goal of analyzing the data.
 
We needed to integrate the data from the following source systems:

  • Salesforce.com
  • Eloqua
  • Harte Hanks
  • Dun and Bradstreet Optimizer
  • Jigsaw Dun and Bradstreet Contacts

To make it happen, we used the following tools and technologies:

  • Informatica Cloud - cloud based integration tool (http://www.informaticacloud.com/)
  • Amazon EC2 (Elastic Cloud Compute) – Cloud based hosting (http://aws.amazon.com/ec2/)
  • Amazon RDS (Relational Database Service) – Cloud based database (http://aws.amazon.com/rds/)
  • GoodData – Cloud based reporting and analytics (http://www.gooddata.com/)

Data from all the source systems was loaded into Amazon RDS and transformed there, and this data was then fed into GoodData, which enabled the creation of complex analytical reports.
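
To give a flavour of the "deep dive" side of this architecture, here is a small, purely illustrative Python sketch of an ad-hoc query against the RDS data mart; the host, credentials, schema, table and column names are hypothetical placeholders rather than the project's actual mart tables.

# Purely illustrative: ad-hoc deep-dive query against the RDS (MySQL) data mart.
# Host, credentials, schema, table and column names are hypothetical.
import pymysql

conn = pymysql.connect(host="marketing-mart.example.us-east-1.rds.amazonaws.com",
                       user="analyst",
                       password="password",
                       database="marketing_mart")

with conn.cursor() as cur:
    cur.execute(
        "SELECT campaign_name, COUNT(*) AS responses "
        "FROM campaign_responses "
        "GROUP BY campaign_name "
        "ORDER BY responses DESC "
        "LIMIT 10"
    )
    for campaign, responses in cur.fetchall():
        print(campaign, responses)

conn.close()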
 
There were some initial challenges while configuring Informatica Cloud on Amazon EC2, setting up secure FTP on Amazon EC2, and configuring Amazon RDS and GoodData, because of our minimal exposure to these technologies. But we had the vision in front of us, which enabled us to overcome all the hurdles and implement the entire integration in the cloud. And by cloud, I mean 100% cloud.
 
Some of the salient features are:

  • The complete integration was implemented on 100% cloud based technologies.
  • Informatica Cloud was configured successfully on an Amazon EC2 UNIX instance.
  • Data volumes to the tune of 2-3 million records were integrated successfully.
  • 83 separate tables in Amazon RDS, containing data from 6 source systems, are part of the data mart solution.
  • Complex analytical reports and dashboards were generated using GoodData.
  • The client previously had to use 3+ separate systems to get reports, which then had to be consolidated via spreadsheets and other tools. The reporting from GoodData is a one-stop shop across multiple systems, all accessible via a web browser. For deeper dives into the data using sophisticated SQL queries, the client can run reports directly on the Amazon RDS database.
  • There was no compromise on security; the client's data was stored on a highly secure cloud platform.
  • Amazon EC2 and RDS are highly scalable, and there are no concerns with respect to availability and flexibility.

We have successfully proved that cloud technologies can be used for complex integrations, and senior managers can now feel relieved as they can run their BI reports at the click of a button, anytime, anywhere.

Friday, January 18, 2013

Data Quality Myths – Understand and Save Money

“The data stored in my systems has tremendous potential. But somehow, I am not able to unlock its true value”
 
Are you also facing the above problem?
 
Data quality issues are the major roadblocks that prevent enterprises from realizing the true potential of their data.
 
The 1-10-100 Rule of total quality management is very much applicable for Data Quality.
 
It takes $1 per record to verify the quality of data upfront (prevention), $10 per record to cleanse and de-dupe it later (correction), and $100 per record if nothing is done and the ramifications of the mistakes are felt over and over again (failure).
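
To make the rule concrete, here is a quick back-of-the-envelope calculation; the per-record dollar figures are the rule's own, while the record count of 100,000 is an arbitrary example.

# Back-of-the-envelope illustration of the 1-10-100 rule for 100,000 records.
records = 100_000

prevention = records * 1    # verify quality upfront: $1 per record
correction = records * 10   # cleanse and de-dupe later: $10 per record
failure    = records * 100  # do nothing and absorb the mistakes: $100 per record

print(f"Prevention: ${prevention:,}")   # Prevention: $100,000
print(f"Correction: ${correction:,}")   # Correction: $1,000,000
print(f"Failure:    ${failure:,}")      # Failure:    $10,000,000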
 
Now that you understand the 1-10-100 rule, let's look at some of the data quality myths that many enterprises still believe.
 
Myth 1 - My data is accurate as I have been using this for years without any problems.
Most people believe that if there are no reported problems or issues, their data is accurate. But have they considered how many business opportunities may never have materialized because of bad data? The worst part is that they have no clue about those missed chances.

A recent report from Artemis Ventures indicated that poor data quality costs the United States economy roughly $3.1 trillion per year. To provide some perspective on this unimaginably large figure, that’s twice the size of the US Federal deficit. An estimate from the US Insurance Data Management Association puts the cost of poor quality data at 15% to 20% of corporations’ operating revenue.
 
Myth 2 - I am getting my data enriched regularly and paying per record for enrichment.
There are various vendors that provide data enrichment services and charge on a per-record basis. Suppose you send 100,000 records and the vendor charges 30 cents per record; the total cost of enrichment is $30,000. Later, you realize that 40,000 of those records were duplicates that should never have been sent for enrichment, which would have saved $12,000.
Clients generally send data to the vendor at periodic intervals, and they may be losing a huge amount of money every single time.
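
The arithmetic behind this myth is easy to script. The short sketch below simply reproduces the figures above and shows how the waste compounds if the same un-deduplicated file is sent at every enrichment cycle (the quarterly cadence is an assumption for illustration).

# Cost of enriching duplicates, using the figures from the example above.
total_records   = 100_000
duplicates      = 40_000
cost_per_record = 0.30          # 30 cents per enriched record

cost_without_dedupe = total_records * cost_per_record                 # $30,000
cost_with_dedupe    = (total_records - duplicates) * cost_per_record  # $18,000
wasted_per_run      = cost_without_dedupe - cost_with_dedupe          # $12,000

runs_per_year = 4  # assumed quarterly enrichment cadence
print(f"Wasted per enrichment run: ${wasted_per_run:,.0f}")
print(f"Wasted per year ({runs_per_year} runs): ${wasted_per_run * runs_per_year:,.0f}")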
 
Myth 3 - I have been using my data for regulatory compliance and there have been no issues lately.
Pharmaceutical and financial institutions need to provide data to regulators for regulatory compliance. It is a critical task, and the slightest non-compliance can result in serious financial and legal implications. In situations where the stakes are this high, one should not wait for issues to arise; one should proactively put measures in place to prevent data quality issues.
 
A report recently issued by Aberdeen Research indicates that almost half of finance employees are “challenged by the fact that their organizations are leveraging risk and compliance data in different formats, making it difficult to compare data.”  According to the report, complying with regulations is a key concern for CFOs. And a distressing number of respondents indicated that the existing IT infrastructure is lacking in the advanced capabilities needed to support governance, risk and compliance (GRC) initiatives.
 
Data has always been king. The sooner you realize this, the more you can expect to save.
The best there is, the best there was and the best there ever will be - and that is Data Quality.
It takes just a tiny bit of invalid or bad information to create monumental issues. Bad data multiplies at an exponential rate, corrupting not only the system in which it originates, but also the many other data sources it interacts with as it moves across the business. Thus, the longer a company waits to detect and correct a bad record, the more severe the damage it can do.
 
Thus, there is a need to establish a Data Governance framework - a combination of disciplines, enhanced processes and the right mix of tools and technology, addressing the critical data issues that will drive the biggest returns and resulting in clean data that delivers results and information that is accurate.
 
References:
http://disastermapping.wordpress.com/2012/02/16/the-costs-of-data-quality-failure-2/
http://blog.match2lists.com/general-information/the-costs-of-data-quality-failure/
http://www.accountancysa.org.za/resources/ShowItemArticle.asp?Article=Data+and+Regulation%3A+Compliance&ArticleId=2398&Issue=1113