2320676 - Platform - File encoding and impact on import jobs. Tips & Tricks from Support

SAP Knowledge Base Article - Public

2320676 - Platform - File encoding and impact on import jobs. Tips & Tricks from Support

Symptom

  • Opening & working on CSV files in Excel.

  • Encoding general tips and verification

  • Issues which could be related to encoding mis-matches. How to verify ?

Resolution

  • Opening & working on CSV files in Excel.


    When opening csv files in Excel you need to be mindful of the following:

    • Encoding issues :
      When opening a file in Excel, typically Excel will display the data in the ANSI format (also called Windows / ISO in SF).
      Excel will NOT auto-detect the encoding of the files. This can affect the way the content of the file is displayed if your file is not encoded using the same encoding standard.
      For example if you have special characters in your file and your file is encoded in UTF-8, then they will show up as corrupted characters when opened in Excel.

      How do I fix this ? :
      The easiest solution would be to use OpenOffice to open you files. Open Office will auto detect the encoding. For small files you can even work from a text editor (i.e Notepad++ or Notepad).

      If are restricted to Excel you can use the below KBA's to work around the behaviour.

      KBA 2115614 - How to open a text file through Microsoft Excel? - Platform Services and Tools
      KBA 2228906 - HOW TO: Save Excel Files to CSV.
      KBA 2088050 - Data Imports: Special Characters in the User Data File - Platform
      KBA 2180312 - Usernames with special characters


    • Numbers with leading zeros.
      In SuccessFactors, we accept userID's and usernames with leading zeros as valid ID's.

      For example we could have the following distinct users
      User John with UserID 000123
      User Bill with UserID 123

      If we were exporting the user data file and opening it with Excel, we would see that Excel is trying to "help" us by considering the USERID column value as a number.
      Since there is no need to display leading zeros in numbers, Excel will strip the leading zeros from John's userID, leaving us with
      User John with UserID 123
      User Bill with UserID 123

      If you changed any section in the file (changes would not even need to be related to John or Bill), saved it, and tried to reimport the new values, you would get errors as you are trying to import different users with the same userID.

      Another way this could impact you is if you were to have only one user John with UserID 000123, and reimported the file after working on it in Excel, you would be creating a NEW user John with UserID 123.

      How do I fix this ? :
      Again, we recommend you use OpenOffice if at all possible to work on your CSV files, as it does not replicate this behaviour.
      If you must edit files in Excel, you can refer to the following KBA
      KBA 2087490 - Data Imports & Exports: How do I open my User Directory in Excel without losing leading zero's in userids or usernames? - Platform

  • Encoding general tips and verification

    • How do I find which encoding my source file or exported file is using ?
      In support, we typically do this by opening the file in a text editor. Notepad++ offers an encoding menu which will highlight the current encoding of the file.
      If you don't have Notepad++ you can also verify this via the basic text editor Notepad.
      To check the encoding in Notepad, you simply need to open the file, then go to file > "save as..".
      In the save options the current encoding of the file should be displayed.

    • Does SuccessFactors recommend a particular encoding ?
      SuccessFactors supports many types of encoding, and there is no encoding better than other types. Some encoding standards provide better support for special characters then others.
      In general we would only recommend that you ensure a consistent encoding standard throughout the import process.
      For example if you have a 3rd party system providing you with a file to import users from, make sure that you have the same encoding setup for the import jobs (and when you perform manual imports).

  • Issues which could be related to encoding mis-matches. How to verify ?

    Use these steps for special characters being corrupted / not showing correctly when viewing the data in the front end.
    The approach used here is to verify the encoding at each step the import process, to ensure there is no mismatch at any stage.

    1. Your first check should be to ensure the actual source file for imported data is correct. If the characters are already garbled / corrupted there, then changing import settings or even file encoding will not resolve your problem.
      If your import file is already corrupted you need to fix it before re-importing. If the file is being provided by a 3rd party tool, contact the support team responsible for that tool to help you fix the original file. If you are creating the files manually youwill have to fix the data yourself.
      Note: Don't verify this in Excel but rather use a text editor like Notepad to look for an example user and see if the characters are broken or not.

    2. Once you have a file that you are happy with make sure to check that the encoding in the file and the encoding value selected during the import are a match.
      If they do not match the characters will get corrupted.

    3. If you confirmed that the original file is displaying fine in Notepad, and that the file encoding matches the import encoding value, then the import process should work well and overwrite the users with special characters with the correct values.
      If not, you should open a support incident to have the issue analysed.
      Make sure to provide support with the details of all the checks you performed on your end so that we can hit the ground running with the investigation.

Keywords

KBA , LOD-SF-PLT , Foundational Capabilities & Tools , LOD-SF-PLT-TLS , Tools , Problem

Product

SAP SuccessFactors HCM Core all versions