
Thu, Aug 12 | Webinar

A Program to Compare Two SAS Format Catalogs & The SAS Data Set Characterization Utility by Michael A. Raithel

Michael A. Raithel is known around the globe for his insightful SAS books and frequent presentations at SAS Global Forum and regional conferences. Don't miss this chance to see him present live for BASUG!

Registration is Closed

Time & Location

Aug 12, 2021, 12:00 PM – 1:00 PM EDT

Webinar

A Program to Compare Two SAS Format Catalogs

SAS programming professionals are sometimes faced with the task of determining the differences between two SAS format catalogs. Perhaps they received an updated format catalog from a collaborating organization, or maybe a colleague updated a format catalog to reflect changes in the underlying data. Either way, how can programmers tell which catalog entries and value/label pairs have been modified? If the two catalogs being compared are relatively small, then the tried-and-true method of outputting each of them via the FMTLIB option of PROC FORMAT and then manually comparing the listings may suffice. However, this method is laborious and error-prone when there are a large number of formats and format value/label pairs.

This paper presents a SAS program that compares two SAS format catalogs and reports the differences between them. It identifies mismatches in the format name, start value, end value, and label between the two catalogs being compared.  Because the comparisons are done programmatically, this method eliminates tedious manual reviews and directly identifies all differences.  Readers can immediately begin using this program to compare their own SAS format catalogs.
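
The presenter's program is not reproduced on this page. As a rough illustration of the kind of comparison described above, the following minimal sketch assumes two hypothetical librefs, OLDFMT and NEWFMT, each containing a catalog named FORMATS. It exports each catalog to a data set with the CNTLOUT= option of PROC FORMAT and then match-merges the entries. The data set names, the STATUS variable, and the 200-character length for START and END are assumptions made for illustration only.

```sas
/* Assumption: librefs OLDFMT and NEWFMT are already assigned and  */
/* each holds a format catalog named FORMATS.                      */
proc format library=oldfmt.formats
            cntlout=work.old_fmts(keep=fmtname type start end label);
run;

proc format library=newfmt.formats
            cntlout=work.new_fmts(keep=fmtname type start end label);
run;

/* Sort both control data sets by the keys that identify an entry. */
proc sort data=work.old_fmts; by fmtname type start end; run;
proc sort data=work.new_fmts; by fmtname type start end; run;

/* Keep only the entries that differ between the two catalogs.     */
data work.fmt_diffs;
  /* Assumption: 200 characters is long enough for START and END;  */
  /* setting the length here avoids a multiple-lengths warning.    */
  length fmtname $32 start end $200 status $12;
  merge work.old_fmts(in=in_old rename=(label=old_label))
        work.new_fmts(in=in_new rename=(label=new_label));
  by fmtname type start end;
  if in_old and not in_new then status = 'OLD ONLY';
  else if in_new and not in_old then status = 'NEW ONLY';
  else if old_label ne new_label then status = 'LABEL DIFF';
  else delete;   /* identical in both catalogs */
run;

proc print data=work.fmt_diffs noobs;
  var status fmtname type start end old_label new_label;
run;
```

In this sketch, an entry whose start or end value changed appears as a pair of rows, one flagged OLD ONLY and one flagged NEW ONLY, because START and END are part of the merge key.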

The SAS Data Set Characterization Utility

Most SAS programmers reach for two tools when they first receive a new SAS data set: PROC CONTENTS and PROC MEANS. They use PROC CONTENTS to review the data set’s metadata, that is, the physical attributes of the variables such as name, label, type, and length. They use PROC MEANS to determine the basic arithmetic characteristics of the numeric variables, such as the minimum, maximum, and mean values. Doing this involves running two different SAS procedures, combing through two separate SAS-generated reports, and correlating the information about specific variables between the disparate reports.

The SAS Data Set Characterization Utility generates a single report file that contains the best of both the CONTENTS and the MEANS procedures. It produces an Excel file with a single row of consolidated metrics for each variable found in the SAS data set. The variable’s metrics include its key metadata attributes and, for numeric variables, its basic statistical properties. Additionally, the report contains the number of missing values for both character and numeric variables. Consequently, SAS programmers can use this utility to determine both the composition and the characteristics of a new SAS data set from a single consolidated report.
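
The utility itself is not shown on this page. As a hedged sketch of the general idea, the code below combines PROC CONTENTS metadata with PROC MEANS statistics into one row per variable and writes the result to an Excel workbook. The input data set (SASHELP.CLASS), the work data set names, the TYPE_DESC column, and the output file name are all assumptions chosen for illustration. Unlike the utility described above, this sketch does not count missing values for character variables.

```sas
/* Assumption: SASHELP.CLASS stands in for the data set of interest. */
%let dsn = sashelp.class;

/* Metadata: one row per variable. */
proc contents data=&dsn noprint
              out=work.meta(keep=name type length label varnum);
run;

/* Statistics for numeric variables. STACKODSOUTPUT produces one    */
/* row per variable in the ODS output data set.                     */
proc means data=&dsn n nmiss min max mean stackodsoutput;
  ods output summary=work.stats;
run;

/* Combine metadata and statistics. Character variables keep their  */
/* metadata and have missing values in the statistics columns.      */
proc sql;
  create table work.profile as
  select m.varnum,
         m.name,
         m.label,
         case m.type when 1 then 'Numeric' else 'Character' end
           as type_desc,
         m.length,
         s.n, s.nmiss, s.min, s.max, s.mean
  from work.meta as m
       left join work.stats as s
       on upcase(m.name) = upcase(s.variable)
  order by m.varnum;
quit;

/* Assumption: profile.xlsx is a hypothetical output file name.     */
ods excel file="profile.xlsx";
proc print data=work.profile noobs;
run;
ods excel close;
```

The left join is a deliberate choice here: it keeps every variable from the metadata even when PROC MEANS reports nothing for it, so the report still has one row per variable.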



COPYRIGHT © 2022  THE BOSTON AREA SAS® USERS GROUP