{x}
blog image

Number of Valid Words in a Sentence

A sentence consists of lowercase letters ('a' to 'z'), digits ('0' to '9'), hyphens ('-'), punctuation marks ('!', '.', and ','), and spaces (' ') only. Each sentence can be broken down into one or more tokens separated by one or more spaces ' '.

A token is a valid word if all three of the following are true:

  • It only contains lowercase letters, hyphens, and/or punctuation (no digits).
  • There is at most one hyphen '-'. If present, it must be surrounded by lowercase characters ("a-b" is valid, but "-ab" and "ab-" are not valid).
  • There is at most one punctuation mark. If present, it must be at the end of the token ("ab,", "cd!", and "." are valid, but "a!b" and "c.," are not valid).

Examples of valid words include "a-b.", "afad", "ba-c", "a!", and "!".

Given a string sentence, return the number of valid words in sentence.

 

Example 1:

Input: sentence = "cat and  dog"
Output: 3
Explanation: The valid words in the sentence are "cat", "and", and "dog".

Example 2:

Input: sentence = "!this  1-s b8d!"
Output: 0
Explanation: There are no valid words in the sentence.
"!this" is invalid because it starts with a punctuation mark.
"1-s" and "b8d" are invalid because they contain digits.

Example 3:

Input: sentence = "alice and  bob are playing stone-game10"
Output: 5
Explanation: The valid words in the sentence are "alice", "and", "bob", "are", and "playing".
"stone-game10" is invalid because it contains digits.

 

Constraints:

  • 1 <= sentence.length <= 1000
  • sentence only contains lowercase English letters, digits, ' ', '-', '!', '.', and ','.
  • There will be at least 1 token.

Solution Explanation: Counting Valid Words in a Sentence

This problem requires us to analyze a sentence and count the number of valid words within it. A valid word adheres to three rules:

  1. Character Set: Contains only lowercase letters, hyphens (-), and punctuation marks (!, ., ,). No digits are allowed.

  2. Hyphen Rule: At most one hyphen is permitted, and it must be surrounded by lowercase letters (e.g., "a-b" is valid, but "-ab" and "ab-" are not).

  3. Punctuation Rule: At most one punctuation mark is allowed, and it must be at the end of the word (e.g., "ab," is valid, but "a!b" is not).

The solution employs a function to validate individual words against these rules, and then iterates through the sentence to count the valid ones.

Approach:

  1. Sentence Splitting: The input sentence is first split into individual words using whitespace as the delimiter. Many languages provide built-in functions for this (e.g., sentence.split() in Python, sentence.split(" ") in Java, strings.Fields() in Go).

  2. Word Validation: A core function, check (or its equivalent in different languages), is designed to verify each word based on the three rules. This function iterates through the word character by character:

    • It immediately rejects words containing digits.
    • It checks for punctuation marks, ensuring they are at the end if present.
    • It manages hyphen occurrences, validating its position and letter-surrounded condition.
  3. Counting Valid Words: After validation, the check function returns 1 if the word is valid, and 0 otherwise. The main function sums up these return values to get the total count of valid words.

Time and Space Complexity Analysis:

  • Time Complexity: O(n), where n is the length of the sentence. This is because the sentence is traversed once to split it into words, and each word is then traversed at most once during validation. The check function has a linear time complexity with respect to the length of the word.

  • Space Complexity: O(n) in the worst case. This is because in the worst-case scenario, all characters in the sentence could be distinct, leading to a space usage proportional to n (for storing the words in an array, list, or similar data structure).

Code Examples (with explanations):

Python:

class Solution:
    def countValidWords(self, sentence: str) -> int:
        def check(word: str) -> bool:
            #Check for digits
            if any(c.isdigit() for c in word):
                return False
 
            hyphen_count = word.count('-')
            if hyphen_count > 1:
                return False
            if '-' in word:
                hyphen_index = word.index('-')
                if hyphen_index == 0 or hyphen_index == len(word) - 1 or not word[hyphen_index - 1].isalpha() or not word[hyphen_index + 1].isalpha():
                    return False
 
            punctuation_count = sum(1 for c in word if c in "!.,")
            if punctuation_count > 1:
                return False
            if any(c in "!.,." for c in word[:-1]): #Check for punctuation not at end
                return False
            return True
 
        words = sentence.split()
        valid_word_count = sum(1 for word in words if check(word))
        return valid_word_count
 

Java:

class Solution {
    public int countValidWords(String sentence) {
        String[] words = sentence.split("\\s+"); //Splits by one or more spaces
        int count = 0;
        for (String word : words) {
            if (isValid(word)) {
                count++;
            }
        }
        return count;
    }
 
    private boolean isValid(String word) {
        //Check for digits
        for (char c : word.toCharArray()) {
            if (Character.isDigit(c)) return false;
        }
 
        int hyphenCount = 0;
        int punctuationCount = 0;
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            if (c == '-') hyphenCount++;
            else if (c == '!' || c == '.' || c == ',') punctuationCount++;
        }
 
        if (hyphenCount > 1 || punctuationCount > 1) return false;
 
        if (hyphenCount == 1) {
            int hyphenIndex = word.indexOf('-');
            if (hyphenIndex == 0 || hyphenIndex == word.length() - 1 || !Character.isLetter(word.charAt(hyphenIndex -1)) || !Character.isLetter(word.charAt(hyphenIndex + 1))) return false;
        }
 
        if (punctuationCount == 1 && word.charAt(word.length()-1) != '!' && word.charAt(word.length()-1) != '.' && word.charAt(word.length()-1) != ',') return false;
 
        return true;
    }
}

The other languages (C++, Go, TypeScript) follow a similar structure, adapting the string manipulation and control flow to their specific syntax. The core logic of splitting the sentence, validating individual words based on the rules, and accumulating the valid word count remains consistent across all implementations.