

facetPlot()

Create Faceted Visualizations of Keyword Trajectories by Zone

The facetPlot() function creates multi-panel plots showing keyword frequency trajectories separated by frequency zones. This faceted approach allows clear comparison of temporal patterns across different vocabulary strata without visual clutter.


🔹 Function Definition

facetPlot(
  data,
  keyword_selection = list(type = "frequency", n = 3, kw.list = NULL),
  r = 4,
  scales = "fixed",
  leg = TRUE,
  themety = "light",
  size_class = NULL,
  x_lab = "years"
)
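
The usage examples on this page pass shorter lists such as list(type = "frequency", n = 3), leaving out kw.list. The sketch below shows only the base R behavior that makes this harmless for code that reads the list with `$`: a missing element comes back as NULL. How facetPlot() actually reads keyword_selection internally is an assumption, not something this page documents.

```r
# Base R behavior only; facetPlot() internals are not shown here (assumption).
sel_full  <- list(type = "frequency", n = 3, kw.list = NULL)
sel_short <- list(type = "frequency", n = 3)

# Reading an absent element with `$` yields NULL, the same as an explicit NULL.
same_kw_list <- identical(sel_full$kw.list, sel_short$kw.list)
print(same_kw_list)
```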

🎯 Purpose

When visualizing temporal patterns across many keywords, overlapping trajectories can become difficult to interpret. Faceting by frequency zone provides clarity by:

  1. Separating complexity: each zone is displayed in its own panel
  2. Enabling comparisons: patterns are easy to compare across zones
  3. Reducing clutter: keywords from different frequency ranges do not overlap
  4. Highlighting representatives: specific keywords are emphasized within each zone
  5. Supporting analysis: different scales can reveal zone-specific patterns
  6. Creating clarity: clean, organized visualizations for publications
  7. Facilitating interpretation: zone-specific dynamics become immediately apparent
  8. Allowing flexible selection: keywords can be chosen by frequency, randomly, or manually

This function is particularly powerful for corpora with many keywords across diverse frequency ranges.


🧮 Keyword Selection Strategies

The keyword_selection parameter controls which keywords are highlighted in each zone:

1. By Frequency (type = "frequency")

Selects the top N most frequent keywords in each zone.

keyword_selection = list(type = "frequency", n = 3, kw.list = NULL)

Use when:

  • You want to highlight the most important terms per zone
  • Exploring core vocabulary within each frequency stratum
  • Creating representative visualizations
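
To make "top N most frequent keywords in each zone" concrete, here is a minimal base R sketch on made-up data. The data frame, its column names, and the helper top_n_per_zone() are hypothetical and only illustrate the selection rule; they are not the package's implementation.

```r
# Toy data: total frequency and zone label per keyword (all values made up).
kw <- data.frame(
  keyword = c("data", "model", "network", "graph",
              "cloud", "kernel", "lattice", "tensor"),
  total   = c(900, 850, 400, 380, 120, 110, 15, 9),
  zone    = c("Zone 4", "Zone 4", "Zone 3", "Zone 3",
              "Zone 2", "Zone 2", "Zone 1", "Zone 1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: the n most frequent keywords within each zone.
top_n_per_zone <- function(df, n) {
  by_zone <- split(df, df$zone)
  lapply(by_zone, function(z) {
    head(z$keyword[order(z$total, decreasing = TRUE)], n)
  })
}

top_kw <- top_n_per_zone(kw, n = 1)
print(top_kw)
```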

2. Random Selection (type = "random")

Randomly samples N keywords from each zone.

keyword_selection = list(type = "random", n = 3, kw.list = NULL)

Use when:

  • You want unbiased representation
  • Exploring general zone behavior
  • Testing methodology robustness
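
Random selection changes from run to run unless the random number generator is seeded. The base R sketch below shows per-zone sampling with a fixed seed on made-up zones; sample_per_zone() is a hypothetical helper. Whether facetPlot() draws its sample through R's standard generator, and so responds to set.seed(), is an assumption worth checking before relying on it.

```r
# Toy vocabulary per zone (made up).
zones <- list(
  "Zone 1" = c("lattice", "tensor", "quark", "morph"),
  "Zone 2" = c("cloud", "kernel", "cache", "queue")
)

# Hypothetical helper: sample up to n keywords from each zone.
sample_per_zone <- function(zones, n, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  lapply(zones, function(kws) {
    kws[sample.int(length(kws), min(n, length(kws)))]
  })
}

# The same seed gives the same draw, which keeps a figure reproducible.
draw_a <- sample_per_zone(zones, n = 2, seed = 42)
draw_b <- sample_per_zone(zones, n = 2, seed = 42)
print(identical(draw_a, draw_b))
```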

3. Custom List (type = "list")

Manually specify which keywords to highlight.

keyword_selection = list(type = "list", n = NULL, kw.list = c("algorithm", "data", "network"))

Use when:

  • You have specific keywords of interest
  • Creating targeted visualizations
  • Following up on previous analyses
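
This page does not say what happens when a keyword in kw.list is absent from the corpus, so it can help to check the list first. A minimal base R sketch, assuming you can obtain the corpus vocabulary as a character vector (the vocabulary below is made up):

```r
# Made-up corpus vocabulary and a custom list to check against it.
vocabulary <- c("algorithm", "data", "network", "cloud")
wanted     <- c("algorithm", "data", "blockchain")

found     <- intersect(wanted, vocabulary)  # safe to pass as kw.list
not_found <- setdiff(wanted, vocabulary)    # misspelled or absent terms

print(found)
print(not_found)
```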


⚙️ Arguments

| Argument | Type | Default | Description |
|---|---|---|---|
| data | List | required | A list object returned by importData() or normalization(), containing the TDM and corpus metadata. |
| keyword_selection | List | list(type = "frequency", n = 3, kw.list = NULL) | Controls keyword highlighting. type: "frequency", "random", or "list"; n: number of keywords per zone (for frequency/random); kw.list: vector of specific keywords (for list type). |
| r | Integer | 4 | Interval for x-axis label thinning. Shows one label every r years. |
| scales | Character | "fixed" | Y-axis scale behavior. "fixed": same scale across all facets; "free": each facet has an independent scale; "free_y": free y-axis, fixed x-axis. |
| leg | Logical | TRUE | If TRUE, displays a legend showing zones and highlighted keywords. |
| themety | Character | "light" | Visual theme. "light": light background; "dark": dark background. |
| size_class | Numeric vector | NULL | Custom line thickness for each zone. If NULL, uses theme defaults. |
| x_lab | Character | "years" | Label for the x-axis. |
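
As an illustration of what "one label every r years" means for the r argument, the base R sketch below thins a made-up year axis. Starting the count from the first year is an assumption; the page does not say where facetPlot() starts.

```r
years <- 2000:2012   # made-up time axis
r <- 4

# Keep one label every r positions, starting from the first year (assumed).
shown  <- years[seq(1, length(years), by = r)]
labels <- ifelse(years %in% shown, as.character(years), "")

print(shown)
print(labels)
```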

📊 Understanding Scales

Fixed Scales (scales = "fixed")

Zone 4 │ [0-1000]
Zone 3 │ [0-1000]  ← Same y-axis range
Zone 2 │ [0-1000]
Zone 1 │ [0-1000]

Advantages:

  • Easy to compare absolute frequencies across zones
  • Immediately see which zones have higher frequencies
  • Maintains proportional relationships

Disadvantages:

  • Low-frequency zones may appear flat
  • Details in smaller zones are harder to see

Free Scales (scales = "free" or "free_y")

Zone 4 │ [500-1000]
Zone 3 │ [200-500]   ← Each zone optimized
Zone 2 │ [50-150]
Zone 1 │ [0-50]

Advantages:

  • Each zone's patterns are clearly visible
  • Reveals details in low-frequency zones
  • Better for pattern analysis

Disadvantages:

  • Cannot compare absolute frequencies
  • May be misleading if not clearly labeled
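
The trade-off between the two scale modes can be quantified on toy data. The base R sketch below computes the shared y-range used by fixed scales and the per-zone ranges used by free y scales; the data frame and its values are made up and do not come from the package.

```r
# Toy long-format data: one row per zone and year (values made up).
traj <- data.frame(
  zone = rep(c("Zone 1", "Zone 4"), each = 3),
  year = rep(2001:2003, times = 2),
  freq = c(5, 20, 12, 600, 750, 900)
)

# Fixed scales: one y-range shared by every panel.
fixed_range <- range(traj$freq)

# Free y scales: one y-range per panel.
free_ranges <- lapply(split(traj$freq, traj$zone), range)

# Share of the shared axis that Zone 1 actually occupies under fixed scales.
share <- diff(free_ranges[["Zone 1"]]) / diff(fixed_range)

print(fixed_range)
print(free_ranges)
print(share)
```

In this toy case Zone 1 occupies under 2% of the shared axis, which is why its panel would look flat with scales = "fixed".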


💡 Usage Examples

Basic Usage - Top Keywords per Zone

library(cccc)

# Import and normalize
corpus <- importData("tdm.csv", "corpus_info.csv")
corpus_norm <- normalization(corpus, normty = "nc")

# Create faceted plot with top 3 keywords per zone
facetPlot(
  corpus_norm,
  keyword_selection = list(type = "frequency", n = 3)
)

Random Selection

# Randomly select 5 keywords per zone
facetPlot(
  corpus_norm,
  keyword_selection = list(type = "random", n = 5)
)

Custom Keyword List

# Highlight specific keywords
facetPlot(
  corpus_norm,
  keyword_selection = list(
    type = "list", 
    kw.list = c("algorithm", "data", "network", "cloud", "machine", "learning")
  )
)

🔍 Interpreting Faceted Plots

What to Look For in Each Panel

High-Frequency Zone Panel

  • Core vocabulary of the corpus
  • Typically more stable trajectories
  • Changes indicate major conceptual shifts
  • Often shows gradual evolution rather than spikes

Medium-Frequency Zone Panels

  • Specialized but established terms
  • May show more dynamic patterns than high-freq zone
  • Often contains rising/falling terms
  • Balance between stability and volatility

Low-Frequency Zone Panel

  • Rare, emerging, or declining terms
  • More volatile and noisy trajectories
  • Spikes may indicate temporary importance
  • Harder to distinguish signal from noise
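
One way to put a number on "stable" versus "volatile" when reading the panels is the coefficient of variation (standard deviation divided by mean). This is an illustrative check on made-up series, not a feature of the package:

```r
stable   <- c(500, 510, 495, 505, 500)  # made-up high-frequency keyword
volatile <- c(2, 9, 1, 12, 3)           # made-up low-frequency keyword

# Coefficient of variation: relative spread around the mean.
cv <- function(x) sd(x) / mean(x)

cv_stable   <- cv(stable)
cv_volatile <- cv(volatile)
print(c(stable = cv_stable, volatile = cv_volatile))
```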

Cross-Panel Comparisons

Parallel trends across zones:

  • Indicate corpus-wide phenomena
  • May reflect historical events or methodological artifacts
  • If all zones show similar patterns, investigate a common cause

Divergent trends:

  • High-frequency zones stable while low-frequency zones are volatile: the normal pattern

Zone-specific spikes:

  • High-frequency spike: a major event (likely a real signal)
  • Low-frequency spike: may be noise or a micro-trend
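
Visual impressions of parallel or unrelated trends can be cross-checked with a simple correlation between zone-level trajectories. The sketch below uses made-up series and base R's cor(); it is one possible check, not something facetPlot() computes.

```r
# Toy zone-level trajectories over the same five years (values made up).
high <- c(100, 110, 125, 140, 150)
low  <- c(4, 5, 7, 8, 9)
flat <- c(6, 5, 6, 5, 6)

# A correlation near 1 suggests parallel trends; near 0 suggests unrelated ones.
cor_parallel  <- cor(high, low)
cor_unrelated <- cor(high, flat)

print(c(parallel = cor_parallel, unrelated = cor_unrelated))
```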


🎯 Choosing Between Fixed and Free Scales

Use Fixed Scales When:

✅ You want to compare absolute frequencies across zones
✅ Showing the magnitude difference between zones is important
✅ Creating visualizations for readers unfamiliar with faceting
✅ Emphasizing that high-frequency terms dominate the corpus

Example scenario: Demonstrating that core vocabulary (high-freq) is much more prevalent than specialized terms (low-freq).

Use Free Scales When:

✅ You want to see patterns within each zone clearly
✅ Low-frequency zones would be too flat with fixed scales
✅ Analyzing temporal dynamics rather than absolute frequencies
✅ Looking for similar patterns across different scales

Example scenario: Identifying whether low-frequency emerging terms show similar temporal patterns to established high-frequency terms.

⚠️ Important: When using free scales, state clearly in the caption or axis labels that each panel has its own y-axis range, so readers do not compare heights across panels.


📈 Use Cases

1. Zone Comparison Studies

Compare how different frequency zones evolve over time.

2. Pattern Discovery

Identify zone-specific temporal patterns not visible in combined plots.

3. Publication Figures

Create clear, organized visualizations for papers showing zone structure.

4. Presentation Material

Use free scales to show patterns clearly to audiences.

5. Data Exploration

Systematically examine keywords across frequency strata.

6. Validation

Verify that zone classification captures meaningful differences.

7. Hypothesis Testing

Test whether predicted patterns appear in expected zones.


💡 Tips & Best Practices

  1. Start with frequency selection: see the most important terms first
  2. Try both scale types: compare fixed vs. free to understand your data
  3. Use an appropriate n: 2-4 keywords per zone is usually optimal
  4. Save large versions: faceted plots need more space than single plots
  5. Label clearly: especially important when using free scales
  6. Match theme to context: light for papers, dark for talks
  7. Consider zone count: more zones need more vertical space
  8. Test keyword selection: try random selection to verify representativeness
  9. Document choices: note which keywords you highlighted, and why, in your methods section
  10. Combine with other plots: use alongside curveCtuPlot() for a comprehensive view
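
Tips 4 and 7 can be combined by scaling the output height with the number of panels. The sketch below uses base R's pdf() device with a placeholder drawing. The zone count, the 2.5 inches per panel, and the file name are arbitrary choices, and because this page does not document what facetPlot() returns, the real call may need to be wrapped in print() when used inside a device like this.

```r
n_zones  <- 4                                   # number of facet panels (assumed)
out_file <- file.path(tempdir(), "facets.pdf")  # hypothetical output path

# Scale page height with the number of panels; 2.5 in per panel is arbitrary.
pdf(out_file, width = 8, height = 2.5 * n_zones)

# Placeholder drawing; the facetPlot() call would go here in practice.
plot(1:10, type = "l")

invisible(dev.off())
print(file.exists(out_file))
```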

📚 See Also

  • curvePlot(): plot all keywords together without faceting
  • curveCtuPlot(): zone curves with highlighted examples (alternative approach)
  • rowMassPlot(): visualize frequency distribution by zone
  • normalization(): normalize before plotting for fair comparison
  • importData(): zone classification happens here

💬 Need Help?

For questions about faceted visualizations:

  • Check the Use Cases page for complete workflow examples
  • Visit the Projects page to see real applications
  • Open an issue on GitHub
  • Contact the team via the About Us page

 

© 2025 The cccc Team | Developed within the RIND Project