[SAS] compress

김영국 2008. 2. 27. 18:06

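Using a modifier with compress makes it easy to strip unwanted special characters, tab characters, and the like.

 * In particular, to remove all punctuation marks in one go, use the 'p' modifier;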
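
As a minimal sketch of the tip above: the DATA step below removes punctuation with the 'p' modifier, then removes punctuation and whitespace together with 'ps'. The variable names and the sample string are made up for illustration.

```sas
data _null_;
   /* hypothetical sample value, not taken from the documentation */
   raw = 'Tel: (02) 123-4567, ext. #89!';

   /* second argument left empty: the character list comes only from the modifier */
   no_punct = compress(raw, , 'p');    /* punctuation marks removed, blanks kept */
   no_punct_space = compress(raw, , 'ps');   /* punctuation and space characters removed */

   put no_punct=;         /* no_punct=Tel 02 1234567 ext 89 */
   put no_punct_space=;   /* no_punct_space=Tel021234567ext89 */
run;
```

Note that the underscore also counts as a punctuation mark, so 'p' removes it as well.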


COMPRESS Function



Removes specific characters from a character string
Category: Character

Syntax

COMPRESS(<source><, chars><, modifiers>)


Arguments

source

specifies a source string that contains characters to remove.

chars

specifies a character string that initializes a list of characters. By default, the characters in this list are removed from the source. If you specify the "K" modifier in the third argument, then only the characters in this list are kept in the result.

Tip: You can add more characters to this list by using other modifiers in the third argument.
Tip: Enclose a literal string of characters in quotation marks.

modifiers

specifies a character string in which each character modifies the action of the COMPRESS function. Blanks are ignored. These are the characters that can be used as modifiers:

a or A

adds letters of the Latin alphabet (A - Z, a - z) to the list of characters.

c or C

adds control characters to the list of characters.

d or D

adds numerals to the list of characters.

f or F

adds the underscore character and letters of the Latin alphabet (A - Z, a - z) to the list of characters.

g or G

adds graphic characters to the list of characters.

i or I

ignores the case of the characters to be kept or removed.

k or K

keeps the characters in the list instead of removing them.

l or L

adds lowercase letters (a - z) to the list of characters.

n or N

adds numerals, the underscore character, and letters of the Latin alphabet (A - Z, a - z) to the list of characters.

o or O

processes the second and third arguments once rather than every time the COMPRESS function is called. Using the "O" modifier in the DATA step (excluding WHERE clauses) or the SQL procedure can make COMPRESS run much faster when you call it in a loop where the second and third arguments do not change.

p or P

adds punctuation marks to the list of characters.

s or S

adds space characters to the list of characters (blank, horizontal tab, vertical tab, carriage return, line feed, and form feed).

t or T

trims trailing blanks from the first and second arguments.

u or U

adds uppercase letters (A - Z) to the list of characters.

w or W

adds printable characters to the list of characters ("W" for writable, because "P" is used for punctuation).

x or X

adds hexadecimal characters to the list of characters.
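
Modifiers can be combined in a single string. The sketch below, using a made-up sample value and variable names, pairs "k" with a character class to keep only that class, and uses "i" to make an explicit character list case-insensitive.

```sas
data _null_;
   /* hypothetical sample value */
   s = 'Order #A-1024, qty: 3 pcs';

   digits_only  = compress(s, , 'kd');      /* k + d: keep only numerals */
   letters_only = compress(s, , 'ka');      /* k + a: keep only Latin letters */
   no_abc       = compress(s, 'abc', 'i');  /* remove a, b, c in either case */

   put digits_only=;    /* digits_only=10243 */
   put letters_only=;   /* letters_only=OrderAqtypcs */
   put no_abc=;         /* no_abc=Order #-1024, qty: 3 ps */
run;
```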


Details

The COMPRESS function allows null arguments. A null argument is treated as a string that has a length of zero.

Based on the number of arguments, the COMPRESS function works as follows:

If you call the COMPRESS function with only the first argument, source

The result is the argument with all blanks removed. If the argument is completely blank, then the result is a string with a length of zero. If you assign the result to a character variable with a fixed length, then the value of that variable will be padded with blanks to fill its defined length.

If you call the COMPRESS function with the first two arguments, source and chars

All characters that appear in the second argument are removed from the result.

If you call the COMPRESS function with three arguments, source, chars, and modifiers

The "k" or "K" modifier (specified in the third argument) determines whether the characters in the second argument are kept or removed from the result.
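
To make the three call forms concrete, here is a small sketch that applies each of them to the same value. The sample string and variable names are illustrative only.

```sas
data _null_;
   /* hypothetical sample value */
   s = 'a1 b2 c3';

   one   = compress(s);              /* one argument: blanks removed */
   two   = compress(s, 'abc');       /* two arguments: a, b, c removed, blanks kept */
   three = compress(s, 'abc', 'k');  /* three arguments with k: only a, b, c kept */

   put one=;     /* one=a1b2c3 */
   put two=;     /* two=1 2 3 */
   put three=;   /* three=abc */
run;
```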

The COMPRESS function compiles a list of characters to keep or remove, comprising the characters in the second argument plus any types of characters that are specified by the modifiers. For example, the "d" or "D" modifier specifies digits. Both of the following function calls remove digits from the result:

COMPRESS(source, "1234567890");
COMPRESS(source, , "d");

To remove digits and plus or minus signs, you could use either of the following function calls:

COMPRESS(source, "1234567890+-");
COMPRESS(source, "+-", "d");

If the COMPRESS function returns a value to a variable that has not yet been assigned a length, then by default the variable length is determined by the length of the first argument.
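
The length rule can be checked with the VLENGTH function, which returns the defined length of a variable. This is a sketch with made-up variable names: src is declared as $20, and out receives no explicit length before the assignment.

```sas
data _null_;
   length src $20;
   src = 'A1 B2 C3';

   /* out has no declared length, so it should take the length of the first argument */
   out = compress(src, , 'kd');
   len_out = vlength(out);

   put out= len_out=;   /* out=123 len_out=20 */
run;
```

If a shorter result variable is wanted, a LENGTH statement for it before the assignment avoids carrying the full source length.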


Examples


Example 1: Compressing Blanks

SAS Statements

a='AB C D ';
b=compress(a);
put b;

Results

----+----1
ABCD


Example 2: Compressing Lowercase Letters

SAS Statements

x='123-4567-8901 B 234-5678-9012 c';
y=compress(x,'ABCD','l');
put y;

Results

----+----1----+----2----+----3
123-4567-8901  234-5678-9012


Example 3: Compressing Tab Characters

SAS Statements

x='1    2    3    4    5';
y=compress(x,,'s');
put y;

Results

----+----1
12345


Example 4: Keeping Characters in the List

SAS Statements

x='Math A English B Physics A';
y=compress(x,'ABCD','k');
put y;

Results

----+----1
ABA